All the methods tried above work. It just seems that the Echorouk site is not always available: sometimes it shows up and sometimes it disappears, so they have problems with their server.
After tracing the errors I managed to get a result, except that its characters are unintelligible.
Like this:
xœì}ÙrW–à3ù×ДHµEì+7lËSÕm«<¶ª—©®@$€$™b g&DI튰J,†#&æibž:\*›-‰¦%ÙÅzŸ�û±¿dÎ9wÉ›@€iz¦h‹2ï~Ï=çܳ.¾õþ¯ß»ù/_gk~ÛfÿæÝõKÍe2ÿTx/“yÿæûìŸyó£Y.e7]£ãY¾åt;“¹~#ÅRk¾ßÏd666Ò…´ã®fn~’¹ƒmå°²ø8çk5Ó-¿•Zž^¤ï´íŽ·”ÐL®V«ñÚ)fÕ¥”Ù™ûͧ)¬1zвܥ”ëÛÐæ4ƒŸéÅ·ææ Ø¢×t®Ou{ƪ¹”ºeÜ6øÃóÜf¨Ó¶Ùê4vþe\sÕ2áwÇèÙ~¦m®Óõæ¦ofŒ”4ït N+}ËK1ÿnZöÍ;~Fk~™-føÇe67·<=µ¸f-üÛ0<“¹æJ¨w³¹æ¸NoÝéØVǤA®‘I±Öñ-ß6áÃTçèQÿuÿàhëèë?…?<Ù9Ú>z¯éÿÅŒ,¾Ø6}ƒa/sæg=ëöRê=§ã›î&:Åšü›=.ôk®®gúK=e®*º§f:Fæ¹jÂJ¸‹s¾ÕÖÛÈy¢4L`¹¦½”òLÃm®É%2º]ÛjRÝŒÓ5;ümËäËOß¾ƒ;=êÒ`ŸRéîZ:ÁI/¥† ÿO¦×6½5ÓôC»Ùô¼ÑG㯙mÓË�b½L×îZz×±í4µ]™Éwé[ÍuÓ=¯ÞZ-õ¶ÙéÉ
Last edited by apitos; 04-06-2011 at 05:58 PM
__________________
My Little Magazine
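A note on the unreadable output quoted above: it begins with `xœ`, which is how the bytes 0x78 0x9C appear when shown as Windows-1252 text, and those two bytes are the usual header of a zlib stream. So the body was probably received compressed and printed without being decoded. The sketch below only illustrates that idea. `decode_body` is a hypothetical helper that does not appear in any script in this thread, and it assumes PHP's zlib extension is loaded.

```php
<?php
// Hypothetical helper: tries to undo HTTP compression on a raw response body.
// Returns the input unchanged when it does not look compressed.
function decode_body(string $raw): string
{
    // gzip stream: magic bytes 0x1F 0x8B
    if (strncmp($raw, "\x1f\x8b", 2) === 0) {
        $out = @gzdecode($raw);
        return $out === false ? $raw : $out;
    }
    // zlib stream: 0x78 followed by one of the common level bytes
    $levels = ["\x01", "\x5e", "\x9c", "\xda"];
    if (strlen($raw) >= 2 && $raw[0] === "\x78" && in_array($raw[1], $levels, true)) {
        $out = @gzuncompress($raw);
        return $out === false ? $raw : $out;
    }
    return $raw;
}
```

With cURL, setting `CURLOPT_ENCODING` to an empty string asks cURL to advertise the encodings it supports and to decode the response itself, which may make a helper like this unnecessary.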
Here is a complete experiment, and it works very well: it fetches the editorial and writes it to a text file.
PHP code:
<?php
/**
* See http://www.bin-co.com/php/scripts/load/
* Version : 1.00.A
*/
function load($url,$options=array('method'=>'get','return_info'=>false)) {
$url_parts = parse_url($url);
$info = array(//Currently only supported by curl.
'http_code' => 200
);
$response = '';
$send_header = array(
'Accept' => 'text/*',
'User-Agent' => 'BinGet/1.00.A (http://www.bin-co.com/php/scripts/load/)'
);
///////////////////////////// Curl /////////////////////////////////////
//If curl is available, use curl to get the data.
if(function_exists("curl_init")
and (!(isset($options['use']) and $options['use'] == 'fsocketopen'))) { //Don't user curl if it is specifically stated to user fsocketopen in the options
if(isset($options['method']) and $options['method'] == 'post') {
$page = $url_parts['scheme'] . '://' . $url_parts['host'] . $url_parts['path'];
} else {
$page = $url;
}
$ch = curl_init($url_parts['host']);
curl_setopt($ch, CURLOPT_URL, $page);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); //Just return the data - not print the whole thing.
curl_setopt($ch, CURLOPT_HEADER, true); //We need the headers
curl_setopt($ch, CURLOPT_NOBODY, false); //The content - if true, will not download the contents
if(isset($options['method']) and $options['method'] == 'post' and $url_parts['query']) {
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $url_parts['query']);
}
//Set the headers our spiders sends
curl_setopt($ch, CURLOPT_USERAGENT, $send_header['User-Agent']); //The Name of the UserAgent we will be using ;)
$custom_headers = array("Accept: " . $send_header['Accept'] );
if(isset($options['modified_since']))
array_push($custom_headers,"If-Modified-Since: ".gmdate('D, d M Y H:i:s \G\M\T',strtotime($options['modified_since'])));
curl_setopt($ch, CURLOPT_HTTPHEADER, $custom_headers);
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt"); //If ever needed...
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
if(isset($url_parts['user']) and isset($url_parts['pass'])) {
$custom_headers = array("Authorization: Basic ".base64_encode($url_parts['user'].':'.$url_parts['pass']));
curl_setopt($ch, CURLOPT_HTTPHEADER, $custom_headers);
}
$response = curl_exec($ch);
$info = curl_getinfo($ch); //Some information on the fetch
curl_close($ch);
//////////////////////////////////////////// FSockOpen //////////////////////////////
} else { //If there is no curl, use fsocketopen
if(isset($url_parts['query'])) {
if(isset($options['method']) and $options['method'] == 'post')
$page = $url_parts['path'];
else
$page = $url_parts['path'] . '?' . $url_parts['query'];
} else {
$page = $url_parts['path'];
}
$fp = fsockopen($url_parts['host'], 80, $errno, $errstr, 30);
if ($fp) {
$out = '';
if(isset($options['method']) and $options['method'] == 'post' and isset($url_parts['query'])) {
$out .= "POST $page HTTP/1.1\r\n";
} else {
$out .= "GET $page HTTP/1.0\r\n"; //HTTP/1.0 is much easier to handle than HTTP/1.1
}
$out .= "Host: $url_parts[host]\r\n";
$out .= "Accept: $send_header[Accept]\r\n";
$out .= "User-Agent: {$send_header['User-Agent']}\r\n";
if(isset($options['modified_since']))
$out .= "If-Modified-Since: ".gmdate('D, d M Y H:i:s \G\M\T',strtotime($options['modified_since'])) ."\r\n";
$out .= "Connection: Close\r\n";
//HTTP Basic Authorization support
if(isset($url_parts['user']) and isset($url_parts['pass'])) {
$out .= "Authorization: Basic ".base64_encode($url_parts['user'].':'.$url_parts['pass']) . "\r\n";
}
//If the request is post - pass the data in a special way.
if(isset($options['method']) and $options['method'] == 'post' and $url_parts['query']) {
$out .= "Content-Type: application/x-www-form-urlencoded\r\n";
$out .= 'Content-Length: ' . strlen($url_parts['query']) . "\r\n";
$out .= "\r\n" . $url_parts['query'];
}
$out .= "\r\n";
fwrite($fp, $out);
while (!feof($fp)) {
$response .= fgets($fp, 128);
}
fclose($fp);
}
}
//Get the headers in an associative array
$headers = array();
if($info['http_code'] == 404) {
$body = "";
$headers['Status'] = 404;
} else {
//Seperate header and content
$separator_position = strpos($response,"\r\n\r\n");
$header_text = substr($response,0,$separator_position);
$body = substr($response,$separator_position+4);
foreach(explode("\n",$header_text) as $line) {
$parts = explode(": ",$line);
if(count($parts) == 2) $headers[$parts[0]] = chop($parts[1]);
}
}
if($options['return_info']) return array('headers' => $headers, 'body' => $body, 'info' => $info);
return $body;
}
//$contents = load('http://www.echoroukonline.com/ara/national/76746.html');
function between($string, $start, $end)
{
$out = explode($start, $string);
if(isset($out[1]))
{
$string = explode($end, $out[1]);
return $string[0];
}
return '';
}
/*
function get_top()
{
$open = load('http://www.echoroukonline.com/ara/national/76746.html');
$code = between($open, '<div id="meme_rubrique_int">', '</div>');
return $code;
}
echo get_top();
*/
function get_top()
{
$open = load('http://www.echoroukonline.com/ara/editorial/index.1.html');
$code = between($open, '<div id="category_right_col_int">', '<div id="box_pagination">');
return $code;
}
//echo get_top();
$key = get_top().''. "\r\n";
$sitemap_file = fopen('echoroukonline.txt','w+');
fwrite($sitemap_file, $key);
fclose($sitemap_file);
?>
Last edited by zamile28; 05-06-2011 at 02:52 PM
Hello brother zamile28.
The code works, but it fetches all the editorials, not just the latest one.
I tried several pieces of code and settled on the one below.
But it has two problems:
1 - When I use this display method, I get nothing:
PHP code:
echo '
<div class="unit">
<a href="{$sUrl2}">{$aTitles}</a>
<div>{$aMetas}</div>
<div>{$aDescriptions}</div>
</div>';
echo '</div>';
2 - And when I use this method, I get text, but the characters are completely unintelligible:
PHP code:
echo "<br />".$aTitles."<br />".$aMetas."<br />".$aDescriptions."<br />";
Thanks.
The full code:
PHP code:
<?php
set_time_limit(0);
$sUrl = 'http://www.echoroukonline.com/ara/editorial/index.1.html';
$sUrlSrc = getWebsiteContent($sUrl,0);
// Load the source
$dom = new DOMDocument("UTF-8");
@$dom->loadHTML($sUrlSrc);
$xpath = new DomXPath($dom);
// =================================== step 1 - links:
$vRes = $xpath->query("/html/body/div/div[2]/div/div[2]/div[4]/div/div/div/h2/a");
// =================================== step 2 - titles:
$aLinks = $vRes->item(0)->getAttribute("href");
echo "<br />aLinks : ".$aLinks."<br />";
$sUrl2 = 'http://www.echoroukonline.com/ara/'.$aLinks;
echo "<br />sUrl2 : ".$sUrl2."<br />";
$sUrlSrc2 = getWebsiteContent($sUrl2,1);
@$dom->loadHTML($sUrlSrc2);
$xpath = new DomXPath($dom);
// =================================== step 3 - titles:
$vRes = $xpath->query(".//*[@id='article_holder']/h1");
$aTitles= $vRes->item(0)->nodeValue;
// =================================== step 4 - Metas:
$vRes = $xpath->query(".//*[@class='article_metadata']");
$aMetas= $vRes->item(0)->nodeValue;
//==================================== step 5 - descriptions:
$vRes = $xpath->query(utf8_encode(".//*[@id='article_body']"));
$aDescriptions= $vRes->item(0)->nodeValue;
//=============================
echo '<link href="css/styles.css" type="text/css" rel="stylesheet"/><div class="main">';
echo '<h1>Using xpath for dom html</h1>';
//echo "<br />".$aTitles."<br />".$aMetas."<br />".$aDescriptions."<br />";
echo '
<div class="unit">
<a href="{$sUrl2}">{$aTitles}</a>
<div>{$aMetas}</div>
<div>{$aDescriptions}</div>
</div>';
echo '</div>';
// this function will return page content using caches (we will load original sources not more than once per hour)
function getWebsiteContent($sUrl,$f=0) {
// our folder with cache files
$sCacheFolder = 'cache/';
// cache filename
if ($f==0) {
$sFilename = 'ech-'.date('YmdHi').'.html';
} else {
$sFilename = 'eftch-'.date('YmdHi').'.html';
}
if (! file_exists($sCacheFolder.$sFilename)) {
$ch = curl_init($sUrl);
$fp = fopen($sCacheFolder.$sFilename, 'w');
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_HTTPHEADER, Array('User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.15) Gecko/20080623 Firefox/2.0.0.15'));
curl_exec($ch);
curl_close($ch);
fclose($fp);
}
//return file_get_contents($sCacheFolder.$sFilename);
return file_get_contents_utf8($sCacheFolder.$sFilename);
}
function file_get_contents_utf8($fn) {
$content = file_get_contents($fn);
return mb_convert_encoding($content, 'UTF-8',
mb_detect_encoding($content, 'UTF-8, ISO-8859-1', true));
}
?>
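One thing visible in the first display method is that the markup sits in a single-quoted string. PHP does not substitute variables inside single quotes, so `{$sUrl2}` and `{$aTitles}` are sent to the browser literally instead of being replaced by their values. The sketch below shows the double-quoted form. `render_unit` is a hypothetical helper, not part of the posted script, and the `htmlspecialchars` escaping is an addition of this sketch.

```php
<?php
// Hypothetical helper: builds the "unit" block as a string.
function render_unit(string $url, string $title, string $meta, string $body): string
{
    // Escape the scraped text so stray markup cannot break the page.
    $url   = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
    $title = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');
    $meta  = htmlspecialchars($meta, ENT_QUOTES, 'UTF-8');
    $body  = htmlspecialchars($body, ENT_QUOTES, 'UTF-8');

    // Double quotes: the {$var} placeholders are replaced with the values.
    return "<div class=\"unit\">\n"
         . "<a href=\"{$url}\">{$title}</a>\n"
         . "<div>{$meta}</div>\n"
         . "<div>{$body}</div>\n"
         . "</div>";
}

// Single quotes, for comparison: the placeholders stay exactly as typed.
$literal = '<a href="{$sUrl2}">{$aTitles}</a>';
```

A call such as `echo render_unit($sUrl2, $aTitles, $aMetas, $aDescriptions);` would then take the place of the single-quoted `echo`.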
Wanting to help you, I ran several experiments, and this is the final result.
It displays any part of the editorial page.
PHP code:
<?php
/**
* See http://www.bin-co.com/php/scripts/load/
* Version : 1.00.A
*/
function load($url,$options=array('method'=>'get','return_info'=>false)) {
$url_parts = parse_url($url);
$info = array(//Currently only supported by curl.
'http_code' => 200
);
$response = '';
$send_header = array(
'Accept' => 'text/*',
'User-Agent' => 'BinGet/1.00.A (http://www.bin-co.com/php/scripts/load/)'
);
///////////////////////////// Curl /////////////////////////////////////
//If curl is available, use curl to get the data.
/*
if(function_exists("curl_init")
and (!(isset($options['use']) and $options['use'] == 'fsocketopen'))) { //Don't user curl if it is specifically stated to user fsocketopen in the options
if(isset($options['method']) and $options['method'] == 'post') {
$page = $url_parts['scheme'] . '://' . $url_parts['host'] . $url_parts['path'];
} else {
$page = $url;
}
$ch = curl_init($url_parts['host']);
curl_setopt($ch, CURLOPT_URL, $page);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); //Just return the data - not print the whole thing.
curl_setopt($ch, CURLOPT_HEADER, true); //We need the headers
curl_setopt($ch, CURLOPT_NOBODY, false); //The content - if true, will not download the contents
if(isset($options['method']) and $options['method'] == 'post' and $url_parts['query']) {
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $url_parts['query']);
}
//Set the headers our spiders sends
curl_setopt($ch, CURLOPT_USERAGENT, $send_header['User-Agent']); //The Name of the UserAgent we will be using ;)
$custom_headers = array("Accept: " . $send_header['Accept'] );
if(isset($options['modified_since']))
array_push($custom_headers,"If-Modified-Since: ".gmdate('D, d M Y H:i:s \G\M\T',strtotime($options['modified_since'])));
curl_setopt($ch, CURLOPT_HTTPHEADER, $custom_headers);
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt"); //If ever needed...
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
if(isset($url_parts['user']) and isset($url_parts['pass'])) {
$custom_headers = array("Authorization: Basic ".base64_encode($url_parts['user'].':'.$url_parts['pass']));
curl_setopt($ch, CURLOPT_HTTPHEADER, $custom_headers);
}
$response = curl_exec($ch);
$info = curl_getinfo($ch); //Some information on the fetch
curl_close($ch);
//////////////////////////////////////////// FSockOpen //////////////////////////////
} else { //If there is no curl, use fsocketopen
*/
if(isset($url_parts['query'])) {
if(isset($options['method']) and $options['method'] == 'post')
$page = $url_parts['path'];
else
$page = $url_parts['path'] . '?' . $url_parts['query'];
} else {
$page = $url_parts['path'];
}
$fp = fsockopen($url_parts['host'], 80, $errno, $errstr, 30);
// $fp = fopen($url_parts['host'], 80, $errno, $errstr, 30);
if ($fp) {
$out = '';
if(isset($options['method']) and $options['method'] == 'post' and isset($url_parts['query'])) {
$out .= "POST $page HTTP/1.1\r\n";
} else {
$out .= "GET $page HTTP/1.0\r\n"; //HTTP/1.0 is much easier to handle than HTTP/1.1
}
$out .= "Host: $url_parts[host]\r\n";
$out .= "Accept: $send_header[Accept]\r\n";
$out .= "User-Agent: {$send_header['User-Agent']}\r\n";
if(isset($options['modified_since']))
$out .= "If-Modified-Since: ".gmdate('D, d M Y H:i:s \G\M\T',strtotime($options['modified_since'])) ."\r\n";
$out .= "Connection: Close\r\n";
//HTTP Basic Authorization support
if(isset($url_parts['user']) and isset($url_parts['pass'])) {
$out .= "Authorization: Basic ".base64_encode($url_parts['user'].':'.$url_parts['pass']) . "\r\n";
}
//If the request is post - pass the data in a special way.
if(isset($options['method']) and $options['method'] == 'post' and $url_parts['query']) {
$out .= "Content-Type: application/x-www-form-urlencoded\r\n";
$out .= 'Content-Length: ' . strlen($url_parts['query']) . "\r\n";
$out .= "\r\n" . $url_parts['query'];
}
$out .= "\r\n";
fwrite($fp, $out);
while (!feof($fp)) {
$response .= fgets($fp, 128);
}
fclose($fp);
}
// }
//Get the headers in an associative array
$headers = array();
if($info['http_code'] == 404) {
$body = "";
$headers['Status'] = 404;
} else {
//Seperate header and content
$separator_position = strpos($response,"\r\n\r\n");
$header_text = substr($response,0,$separator_position);
$body = substr($response,$separator_position+4);
foreach(explode("\n",$header_text) as $line) {
$parts = explode(": ",$line);
if(count($parts) == 2) $headers[$parts[0]] = chop($parts[1]);
}
}
if($options['return_info']) return array('headers' => $headers, 'body' => $body, 'info' => $info);
return $body;
}
//$contents = load('http://www.echoroukonline.com/ara/national/76746.html');
function between($string, $start, $end)
{
$out = explode($start, $string);
if(isset($out[1]))
{
$string = explode($end, $out[1]);
return $string[0];
}
return '';
}
/*
function get_top()
{
$open = load('http://www.echoroukonline.com/ara/national/76746.html');
$code = between($open, '<div id="meme_rubrique_int">', '</div>');
return $code;
}
echo get_top();
*/
/*
function get_top()
{
$open = load('http://www.echoroukonline.com/ara/editorial/index.1.html');
$code = between($open, '<div id="category_right_col_int">', '<div id="box_pagination">');
return $code;
}
*/
//echo get_top();
$open = load('http://www.echoroukonline.com/ara/editorial/index.1.html');
// $result= file_get_contents('http://www.echoroukonline.com/ara/editorial/index.1.html');
preg_match_all('#<div class="short_holder_rubrique">(.+?)</div>#smi', $open,$echoroukonline);
$editorial1=strip_tags($echoroukonline[1][0]);
$editorial2=strip_tags($echoroukonline[1][1]);
$editorial3=strip_tags($echoroukonline[1][2]);
$editorial4=strip_tags($echoroukonline[1][3]);
echo"<br>";
echo $editorial1;
echo"<br>";
echo $editorial2;
echo"<br>";
echo $editorial3;
echo"<br>";
echo $editorial4;
echo"<br>";
/*
$key = get_top().''. "\r\n";
$sitemap_file = fopen('echoroukonline.txt','w+');
fwrite($sitemap_file, $key);
fclose($sitemap_file);
*/
// ### Checks for presence of the cURL extension.
//function _iscurlinstalled() {
// if (in_array ('curl', get_loaded_extensions())) {
// return true;
// }
// else{
// return false;
// }
//}
//Sample usage:
//if (_iscurlinstalled()) echo "cURL is installed"; else echo "cURL is NOT installed";
?>
Display any item this way, according to its position in the list. I think there are 11, which means that one is the last.
PHP code:
$editorial11=strip_tags($echoroukonline[1][11]);
Let me know if it works.
Last edited by zamile28; 08-06-2011 at 10:14 PM
Hello brother zamile, and God bless you for lending a hand.
There are results, but I think the fsockopen function is not available on every server.
Also, what I want is the content of the most recently added editorial.
Even the last code I posted works well, except for the problem with how the Arabic characters are displayed, as I mentioned before.
Last edited by apitos; 09-06-2011 at 02:48 PM
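On the Arabic characters problem: a commonly reported cause with `DOMDocument::loadHTML()` is that the parser falls back to ISO-8859-1 when the markup does not declare its charset early enough, so UTF-8 bytes come out garbled. A frequently used workaround is to prepend an XML encoding hint before parsing. The sketch below assumes the fetched page really is UTF-8. `load_utf8_html` is a hypothetical helper, not something from the posted scripts. Note also that the `DOMDocument` constructor takes the XML version first and the encoding second, so `new DOMDocument("UTF-8")` passes "UTF-8" as the version.

```php
<?php
// Hypothetical helper: parses an HTML string that is assumed to be UTF-8.
function load_utf8_html(string $html): DOMDocument
{
    $dom = new DOMDocument('1.0', 'UTF-8');
    // Collect parser warnings quietly instead of using the @ operator.
    $previous = libxml_use_internal_errors(true);
    // The encoding hint makes libxml read the bytes as UTF-8.
    $dom->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $dom;
}

// Usage with a small made-up page:
$dom   = load_utf8_html('<html><body><h1 id="t">الافتتاحية</h1></body></html>');
$xpath = new DOMXPath($dom);
$title = $xpath->query("//*[@id='t']")->item(0)->nodeValue;
```

If the source page turns out to be in another encoding, it would need converting to UTF-8 first, for example with `mb_convert_encoding` and an explicitly named source encoding.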
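The method I gave you above works with both curl and fsockopen.
Just remove the /* near the top of the code so that curl runs. I commented it out because it does not work on my machine.
Try this one: it is light and fast at fetching the latest article.
For your information, sometimes there are nine articles on the editorial page and sometimes ten. Just change the index according to the count. I hope that is clear.
PHP code: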
<?php
header("Content-Type: text/html; charset=utf-8");
function getHtmlCodeViaCurl($url){
$userAgent=array();
$userAgent[]="Opera/9.50 (Windows NT 5.1; U; en)";
$userAgent[]="Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like Gecko) Version/3.1 Safari/525.13";
$userAgent[]="Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; FDM; MEGAUPLOAD 1.0; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30)";
$userAgent[]="Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12";
$userAgent[]="Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; FDM; MEGAUPLOAD 1.0; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30)";
$userAgent[]="Mozilla/5.0 (Windows; U; Windows NT 5.1; en) AppleWebKit/522.15.5 (KHTML, like Gecko) Version/3.0.3 Safari/522.15.5";
$total1=count($userAgent)-1;
$rand1=rand(0,$total1);
$curl = curl_init() or die("FATAL ERROR: cURL support is not found on this server.");
curl_setopt($curl, CURLOPT_USERAGENT, $userAgent[$rand1]);
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_TIMEOUT, 20);
return curl_exec($curl);
}
$url = 'http://www.echoroukonline.com/ara/editorial/index.1.html';
$open = getHtmlCodeViaCurl($url);
preg_match_all('#<div class="short_holder_rubrique">(.+?)</div>#smi', $open,$echoroukonline);
$editorial9=strip_tags($echoroukonline[1][9]);
echo"<br>";
echo $editorial9;
echo"<br>";
?>
Last edited by zamile28; 15-06-2011 at 02:59 AM. Reason: change in the work
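Since the number of entries on the page changes, the hard-coded index can be avoided by taking whatever the last match is. The sketch below assumes, as the post above does, that the wanted entry is the last `short_holder_rubrique` block on the page. `last_summary` is a hypothetical helper name.

```php
<?php
// Hypothetical helper: returns the text of the last block matching the
// pattern used in the thread, or null when nothing matches.
function last_summary(string $html): ?string
{
    $pattern = '#<div class="short_holder_rubrique">(.+?)</div>#si';
    if (!preg_match_all($pattern, $html, $m) || empty($m[1])) {
        return null;
    }
    // end() picks the final captured block, however many there are.
    return trim(strip_tags(end($m[1])));
}
```

Calling `last_summary($open)` would replace the `$echoroukonline[1][9]` lookup, and the null result gives a way to detect a page whose layout has changed.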
Hello brother zamile28,
This link contains a short list of the editorial topics:
http://www.echoroukonline.com/ara/editorial/index.1.html
But I want to get the full text of the most recently added editorial, whose link looks like this:
http://www.echoroukonline.com/ara/editorial/77456.html
Thanks.
I will give you the key, and the rest is up to you.
The first script fetches the latest article on the editorial page. Modify it as you like.
The second script fetches the full article.
PHP code:
<?php
header("Content-Type: text/html; charset=utf-8");
function load($url,$options=array('method'=>'get','return_info'=>false)) {
$url_parts = parse_url($url);
$info = array(//Currently only supported by curl.
'http_code' => 200
);
$response = '';
$send_header = array(
'Accept' => 'text/*',
'User-Agent' => 'BinGet/1.00.A (http://www.bin-co.com/php/scripts/load/)'
);
///////////////////////////// Curl /////////////////////////////////////
//If curl is available, use curl to get the data.
if(function_exists("curl_init")
and (!(isset($options['use']) and $options['use'] == 'fsocketopen'))) { //Don't user curl if it is specifically stated to user fsocketopen in the options
if(isset($options['method']) and $options['method'] == 'post') {
$page = $url_parts['scheme'] . '://' . $url_parts['host'] . $url_parts['path'];
} else {
$page = $url;
}
$ch = curl_init($url_parts['host']);
curl_setopt($ch, CURLOPT_URL, $page);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); //Just return the data - not print the whole thing.
curl_setopt($ch, CURLOPT_HEADER, true); //We need the headers
curl_setopt($ch, CURLOPT_NOBODY, false); //The content - if true, will not download the contents
if(isset($options['method']) and $options['method'] == 'post' and $url_parts['query']) {
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $url_parts['query']);
}
//Set the headers our spiders sends
curl_setopt($ch, CURLOPT_USERAGENT, $send_header['User-Agent']); //The Name of the UserAgent we will be using ;)
$custom_headers = array("Accept: " . $send_header['Accept'] );
if(isset($options['modified_since']))
array_push($custom_headers,"If-Modified-Since: ".gmdate('D, d M Y H:i:s \G\M\T',strtotime($options['modified_since'])));
curl_setopt($ch, CURLOPT_HTTPHEADER, $custom_headers);
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt"); //If ever needed...
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
if(isset($url_parts['user']) and isset($url_parts['pass'])) {
$custom_headers = array("Authorization: Basic ".base64_encode($url_parts['user'].':'.$url_parts['pass']));
curl_setopt($ch, CURLOPT_HTTPHEADER, $custom_headers);
}
$response = curl_exec($ch);
$info = curl_getinfo($ch); //Some information on the fetch
curl_close($ch);
//////////////////////////////////////////// FSockOpen //////////////////////////////
} else { //If there is no curl, use fsocketopen
if(isset($url_parts['query'])) {
if(isset($options['method']) and $options['method'] == 'post')
$page = $url_parts['path'];
else
$page = $url_parts['path'] . '?' . $url_parts['query'];
} else {
$page = $url_parts['path'];
}
$fp = fsockopen($url_parts['host'], 80, $errno, $errstr, 30);
// $fp = fopen($url_parts['host'], 80, $errno, $errstr, 30);
if ($fp) {
$out = '';
if(isset($options['method']) and $options['method'] == 'post' and isset($url_parts['query'])) {
$out .= "POST $page HTTP/1.1\r\n";
} else {
$out .= "GET $page HTTP/1.0\r\n"; //HTTP/1.0 is much easier to handle than HTTP/1.1
}
$out .= "Host: $url_parts[host]\r\n";
$out .= "Accept: $send_header[Accept]\r\n";
$out .= "User-Agent: {$send_header['User-Agent']}\r\n";
if(isset($options['modified_since']))
$out .= "If-Modified-Since: ".gmdate('D, d M Y H:i:s \G\M\T',strtotime($options['modified_since'])) ."\r\n";
$out .= "Connection: Close\r\n";
//HTTP Basic Authorization support
if(isset($url_parts['user']) and isset($url_parts['pass'])) {
$out .= "Authorization: Basic ".base64_encode($url_parts['user'].':'.$url_parts['pass']) . "\r\n";
}
//If the request is post - pass the data in a special way.
if(isset($options['method']) and $options['method'] == 'post' and $url_parts['query']) {
$out .= "Content-Type: application/x-www-form-urlencoded\r\n";
$out .= 'Content-Length: ' . strlen($url_parts['query']) . "\r\n";
$out .= "\r\n" . $url_parts['query'];
}
$out .= "\r\n";
fwrite($fp, $out);
while (!feof($fp)) {
$response .= fgets($fp, 128);
}
fclose($fp);
}
}
//Get the headers in an associative array
$headers = array();
if($info['http_code'] == 404) {
$body = "";
$headers['Status'] = 404;
} else {
//Seperate header and content
$separator_position = strpos($response,"\r\n\r\n");
$header_text = substr($response,0,$separator_position);
$body = substr($response,$separator_position+4);
foreach(explode("\n",$header_text) as $line) {
$parts = explode(": ",$line);
if(count($parts) == 2) $headers[$parts[0]] = chop($parts[1]);
}
}
if($options['return_info']) return array('headers' => $headers, 'body' => $body, 'info' => $info);
return $body;
}
function getHtmlCodeViaCurl($url){
$userAgent=array();
$userAgent[]="Opera/9.50 (Windows NT 5.1; U; en)";
$userAgent[]="Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like Gecko) Version/3.1 Safari/525.13";
$userAgent[]="Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; FDM; MEGAUPLOAD 1.0; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30)";
$userAgent[]="Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12";
$userAgent[]="Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; FDM; MEGAUPLOAD 1.0; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30)";
$userAgent[]="Mozilla/5.0 (Windows; U; Windows NT 5.1; en) AppleWebKit/522.15.5 (KHTML, like Gecko) Version/3.0.3 Safari/522.15.5";
$total1=count($userAgent)-1;
$rand1=rand(0,$total1);
$curl = curl_init() or die("FATAL ERROR: cURL support is not found on this server.");
curl_setopt($curl, CURLOPT_USERAGENT, $userAgent[$rand1]);
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_TIMEOUT, 20);
return curl_exec($curl);
}
$url = 'http://www.echoroukonline.com/ara/editorial/index.1.html';
//$open = getHtmlCodeViaCurl($url);
$open = load($url);
//preg_match_all('#<div class="short_holder_rubrique">(.+?)</div>#smi', $open,$echoroukonline);
//$editorial9=strip_tags($echoroukonline[1][9]);
preg_match_all('#<span class="summary">(.+?)</span>#smi', $open,$echoroukonline);
$editorial9=strip_tags($echoroukonline[1][10]);
preg_match_all('#<span class="visit">(.+?)</span>#smi', $open,$echoroukonline);
$editorialurl=$echoroukonline[1][9];
preg_match("~<a href=\"(.*?)\">~i", $editorialurl, $epiurl);
$editoriallink =$epiurl[1];
echo"<br>";
echo $editorial9;
echo"<br>";
echo $editorialurl;
echo"<br>";
echo "<a href='curl-echorouk2.php?id=".$editoriallink."'>المزيد</a>";
echo"<br>";
?>
PHP code (the full-article script, linked from the first one as curl-echorouk2.php):
<?php
header("Content-Type: text/html; charset=utf-8");
function load($url,$options=array('method'=>'get','return_info'=>false)) {
$url_parts = parse_url($url);
$info = array(//Currently only supported by curl.
'http_code' => 200
);
$response = '';
$send_header = array(
'Accept' => 'text/*',
'User-Agent' => 'BinGet/1.00.A (http://www.bin-co.com/php/scripts/load/)'
);
///////////////////////////// Curl /////////////////////////////////////
//If curl is available, use curl to get the data.
if(function_exists("curl_init")
and (!(isset($options['use']) and $options['use'] == 'fsocketopen'))) { //Don't user curl if it is specifically stated to user fsocketopen in the options
if(isset($options['method']) and $options['method'] == 'post') {
$page = $url_parts['scheme'] . '://' . $url_parts['host'] . $url_parts['path'];
} else {
$page = $url;
}
$ch = curl_init($url_parts['host']);
curl_setopt($ch, CURLOPT_URL, $page);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); //Just return the data - not print the whole thing.
curl_setopt($ch, CURLOPT_HEADER, true); //We need the headers
curl_setopt($ch, CURLOPT_NOBODY, false); //The content - if true, will not download the contents
if(isset($options['method']) and $options['method'] == 'post' and $url_parts['query']) {
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $url_parts['query']);
}
//Set the headers our spiders sends
curl_setopt($ch, CURLOPT_USERAGENT, $send_header['User-Agent']); //The Name of the UserAgent we will be using ;)
$custom_headers = array("Accept: " . $send_header['Accept'] );
if(isset($options['modified_since']))
array_push($custom_headers,"If-Modified-Since: ".gmdate('D, d M Y H:i:s \G\M\T',strtotime($options['modified_since'])));
curl_setopt($ch, CURLOPT_HTTPHEADER, $custom_headers);
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt"); //If ever needed...
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
if(isset($url_parts['user']) and isset($url_parts['pass'])) {
$custom_headers = array("Authorization: Basic ".base64_encode($url_parts['user'].':'.$url_parts['pass']));
curl_setopt($ch, CURLOPT_HTTPHEADER, $custom_headers);
}
$response = curl_exec($ch);
$info = curl_getinfo($ch); //Some information on the fetch
curl_close($ch);
//////////////////////////////////////////// FSockOpen //////////////////////////////
} else { //If there is no curl, use fsocketopen
if(isset($url_parts['query'])) {
if(isset($options['method']) and $options['method'] == 'post')
$page = $url_parts['path'];
else
$page = $url_parts['path'] . '?' . $url_parts['query'];
} else {
$page = $url_parts['path'];
}
$fp = fsockopen($url_parts['host'], 80, $errno, $errstr, 30);
// $fp = fopen($url_parts['host'], 80, $errno, $errstr, 30);
if ($fp) {
$out = '';
if(isset($options['method']) and $options['method'] == 'post' and isset($url_parts['query'])) {
$out .= "POST $page HTTP/1.1\r\n";
} else {
$out .= "GET $page HTTP/1.0\r\n"; //HTTP/1.0 is much easier to handle than HTTP/1.1
}
$out .= "Host: $url_parts[host]\r\n";
$out .= "Accept: $send_header[Accept]\r\n";
$out .= "User-Agent: {$send_header['User-Agent']}\r\n";
if(isset($options['modified_since']))
$out .= "If-Modified-Since: ".gmdate('D, d M Y H:i:s \G\M\T',strtotime($options['modified_since'])) ."\r\n";
$out .= "Connection: Close\r\n";
//HTTP Basic Authorization support
if(isset($url_parts['user']) and isset($url_parts['pass'])) {
$out .= "Authorization: Basic ".base64_encode($url_parts['user'].':'.$url_parts['pass']) . "\r\n";
}
//If the request is post - pass the data in a special way.
if(isset($options['method']) and $options['method'] == 'post' and $url_parts['query']) {
$out .= "Content-Type: application/x-www-form-urlencoded\r\n";
$out .= 'Content-Length: ' . strlen($url_parts['query']) . "\r\n";
$out .= "\r\n" . $url_parts['query'];
}
$out .= "\r\n";
fwrite($fp, $out);
while (!feof($fp)) {
$response .= fgets($fp, 128);
}
fclose($fp);
}
}
//Get the headers in an associative array
$headers = array();
if($info['http_code'] == 404) {
$body = "";
$headers['Status'] = 404;
} else {
//Seperate header and content
$separator_position = strpos($response,"\r\n\r\n");
$header_text = substr($response,0,$separator_position);
$body = substr($response,$separator_position+4);
foreach(explode("\n",$header_text) as $line) {
$parts = explode(": ",$line);
if(count($parts) == 2) $headers[$parts[0]] = chop($parts[1]);
}
}
if($options['return_info']) return array('headers' => $headers, 'body' => $body, 'info' => $info);
return $body;
}
function getHtmlCodeViaCurl($url){
$userAgent=array();
$userAgent[]="Opera/9.50 (Windows NT 5.1; U; en)";
$userAgent[]="Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like Gecko) Version/3.1 Safari/525.13";
$userAgent[]="Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; FDM; MEGAUPLOAD 1.0; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30)";
$userAgent[]="Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12";
$userAgent[]="Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; FDM; MEGAUPLOAD 1.0; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30)";
$userAgent[]="Mozilla/5.0 (Windows; U; Windows NT 5.1; en) AppleWebKit/522.15.5 (KHTML, like Gecko) Version/3.0.3 Safari/522.15.5";
$total1=count($userAgent)-1;
$rand1=rand(0,$total1);
$curl = curl_init() or die("FATAL ERROR: cURL support is not found on this server.");
curl_setopt($curl, CURLOPT_USERAGENT, $userAgent[$rand1]);
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_TIMEOUT, 20);
return curl_exec($curl);
}
$id=$_GET["id"];
$url = "http://www.echoroukonline.com/ara/".$id."";
//$open = getHtmlCodeViaCurl($url);
$open = load($url);
//preg_match_all('#<div class="short_holder_rubrique">(.+?)</div>#smi', $open,$echoroukonline);
//$editorial9=strip_tags($echoroukonline[1][9]);
//echo $url;
preg_match('#<div id="article_body">(.+?)</div>#smi', $open,$echoroukonline);
$editorial9=strip_tags($echoroukonline[1]);
echo"<br>";
echo $editorial9;
echo"<br>";
?>
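The second script places `$_GET["id"]` straight into the URL it fetches, so any visitor can make the server request an arbitrary path on that host. A cautious variant would check the value first. The sketch below assumes the links look like `editorial/77456.html` (a section name, a slash, digits, `.html`), based on the example link earlier in the thread. `build_article_url` is a hypothetical helper, and the accepted pattern is a guess that may need widening.

```php
<?php
// Hypothetical helper: returns the article URL for a trusted-looking id,
// or null when the id does not match the assumed "section/12345.html" shape.
function build_article_url($id): ?string
{
    // \A and \z anchor the whole string, so a trailing newline is rejected too.
    if (!is_string($id) || !preg_match('~\A[a-z_]+/\d+\.html\z~', $id)) {
        return null;
    }
    return 'http://www.echoroukonline.com/ara/' . $id;
}
```

In the script, `$url = build_article_url($_GET["id"] ?? null);` followed by an early exit when `$url` is null would replace the direct concatenation.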
Note: you need a local server. I recommend AppServ.
It is the best and behaves much like a real server.
Enable curl.
Run several experiments, draw up a plan, and work from it.
All right, brother zamile.
Thank you for standing by me, and may God reward you well.