I have the following problem. I have already read a fair bit on this subject, though clearly still not enough, and I wanted to experiment on my own.
I created an index.php file on the server:
CODE
<?php
// Fetch $url with cURL and return the response body (false on failure).
function grab($url, $referer = '', $ip = '') {
    // CURLOPT_USERAGENT takes only the value, without the "User-Agent: " prefix
    $user_agent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)";
    $ch = curl_init();
    $headadd = array("Accept: image/gif", "Accept-Language: pl");
    curl_setopt($ch, CURLOPT_URL, $url);
    if ($ip != "") {
        // bind the request to a specific outgoing interface or IP address
        curl_setopt($ch, CURLOPT_INTERFACE, $ip);
    }
    curl_setopt($ch, CURLOPT_REFERER, $referer);
    curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);
    curl_setopt($ch, CURLOPT_HTTPHEADER, $headadd);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 3);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);
    $result = curl_exec($ch);
    $info = curl_getinfo($ch);
    $code = $info['http_code']; // HTTP status, currently unused
    return $result;
}
$result = grab('http://encyklopedia.interia.pl');
// [A-Za-z] instead of [A-z]: the latter also matches [ \ ] ^ _ and `
preg_match_all("/onMouseOut=\"window.status='\'; return true;\"
class=\"ht\">([A-Za-z0-9 \-]+)<\/a>/",$result,$domena);
// note: this prints the whole fetched page; the matches are in $domena
print_r($result);
?>
and an .htaccess file:
CODE
Options +FollowSymLinks
RewriteEngine On
RewriteRule ^index\.html$ index.php [L]
RewriteRule ^haslo-([^-]+)-([^-]+)\.html$ haslo?hid=$1&kid=$2 [L]
# the specific pk/ih/kid pattern has to come before the catch-all rules
RewriteRule ^pk([^-]+)ih([^-]+)kid([^-]+)\.html$ katalog?kid=$3&pk=$1&ih=$2 [L]
RewriteRule ^(.*)\.html$ litery?l=$1 [L]
# never reached: the rule above already matches every remaining *.html
RewriteRule ^(.*)\.html$ katalog?kid=$1 [L]
The results, however, are not what I was hoping for.
What I want is to extract only this sequence of characters:
A B C D E F G H I J K L Ł M N O P R S T U W Z
with working links.
Does this depend on the pattern passed to preg_match_all?
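To make clearer what I am after, here is a minimal sketch that runs against a made-up HTML fragment instead of the real page. The sample markup, the `/litery?l=...` hrefs and the variable names `$html`, `$pattern`, `$links` and `$alphabet` are my assumptions for illustration; the actual markup on encyklopedia.interia.pl may look different, so the pattern would probably need adjusting. The idea is to capture both the href and the link text, and to accept only links whose text is a single uppercase letter.

```php
<?php
// Hypothetical sample of the markup; the real page may differ.
$html = '<a href="/litery?l=A" class="ht">A</a> '
      . '<a href="/litery?l=%C5%81" class="ht">Ł</a> '
      . '<a href="/haslo?hid=1" class="ht">Some entry 1</a>';

// Group 1: the href value. Group 2: the link text, limited to ONE uppercase letter.
// The "u" modifier treats pattern and subject as UTF-8, so \p{Lu} also matches Ł.
$pattern = '/<a\s[^>]*href="([^"]+)"[^>]*class="ht"[^>]*>(\p{Lu})<\/a>/u';

// PREG_SET_ORDER gives one array per match: [full match, href, letter]
preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

$links = [];
foreach ($matches as $m) {
    // rebuild a plain link from the captured href and letter
    $links[] = '<a href="' . htmlspecialchars($m[1]) . '">' . $m[2] . '</a>';
}

// the letters joined with spaces, each one still a link
$alphabet = implode(' ', $links);
echo $alphabet, "\n";
```

In this sketch the third link is skipped because its text is longer than one letter, so what gets extracted does depend on the pattern. The matches end up in the third argument of preg_match_all, so that is the variable to print, not the fetched page.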