
Process the file contents below: extract the domain names, count them, and sort by count.

[Short-answer question]

Process the file contents below: extract the domain names, count them, and sort by count. For example, given:
http://www.baidu.com/index.html http://www.baidu.com/1.html http://post.baidu.com/index.html
http://mp3.baidu.com/index.html
http://www.baidu.com/3.html
http://post.baidu.com/2.html
the expected result is (occurrence count, then domain name):
3 www.baidu.com
2 post.baidu.com
1 mp3.baidu.com
You may use any one of bash, perl, php, or c.

dee

No regex needed.
PHP was built for the web, so it has a built-in function for parsing URLs:
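<?php
$sites = [
    'http://www.baidu.com/index.html',
    'http://www.baidu.com/1.html',
    'http://post.baidu.com/index.html',
    'http://mp3.baidu.com/index.html',
    'http://www.baidu.com/3.html',
    'http://post.baidu.com/2.html'
];
// parse_url() with PHP_URL_HOST returns only the host part of each URL.
$hosts = array_map(function ($url) {
    return parse_url($url, PHP_URL_HOST);
}, $sites);
// Count how often each host occurs, then sort by count, highest first.
$data = array_count_values($hosts);
arsort($data);
// Print "count host", one pair per line.
foreach ($data as $host => $count) {
    echo $count, ' ', $host, "\n";
}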
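
Since the question also allows bash, the same no-regex idea can be sketched there with parameter expansion. This is only a rough sketch: `url_host` and `count_hosts` are made-up helper names, and it assumes plain `scheme://host/path` URLs (a port or `user:pass@` prefix would be left in the host).

```shell
#!/usr/bin/env bash
# url_host: hypothetical helper, not part of the original answer.
# Prints the host of a URL using only parameter expansion (no regex).
# Assumes the form scheme://host/path; ports and credentials are not stripped.
url_host() {
  local rest="${1#*://}"        # drop the scheme, e.g. "http://"
  printf '%s\n' "${rest%%/*}"   # drop everything from the first "/" onward
}

# count_hosts: hypothetical helper. Reads whitespace-separated URLs on stdin
# and prints "count host" lines, highest count first.
count_hosts() {
  local url
  tr -s ' \t' '\n\n' |                          # put one URL on each line
    while IFS= read -r url || [ -n "$url" ]; do
      [ -n "$url" ] && url_host "$url"          # skip empty lines
    done |
    sort |                                      # group identical hosts
    uniq -c |                                   # prefix each host with its count
    sort -k1,1nr -k2,2 |                        # by count descending, then host
    awk '{print $1, $2}'                        # strip uniq's leading padding
}
```

With the URLs saved in a file named test, `count_hosts < test` prints the count and host lines the question asks for.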

Edited 2016-07-06 21:50:15 Replies (0)

小狼狗花园

awk '{for(i=1;i<=NF;i++) print $i}' test | cut -d'/' -f3 | sort | uniq -ic | sort -nr
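
For anyone reading the one-liner above, here it is split into stages with comments. The function name `count_domains` is made up for illustration, and it reads stdin instead of the file test. Two small changes that are my own assumptions rather than part of the original: ties are broken by host name so the order is deterministic, and a final awk strips the padding that uniq puts in front of each count.

```shell
#!/usr/bin/env bash
# count_domains: hypothetical wrapper name, not from the original post.
# Reads URLs on stdin (several per line allowed) and prints "count host"
# lines, most frequent host first.
count_domains() {
  awk '{for (i = 1; i <= NF; i++) print $i}' |  # one URL per line
    cut -d'/' -f3 |                             # 3rd "/"-separated field is the host
    sort |                                      # uniq only merges adjacent lines
    uniq -ic |                                  # count, ignoring case (-i)
    sort -k1,1nr -k2,2 |                        # by count descending, then host
    awk '{print $1, $2}'                        # strip uniq's leading padding
}
```

Usage: `count_domains < test`. The cut step relies on every URL having the form scheme://host/..., so that splitting on "/" leaves the host in field 3.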

Posted 2015-01-01 17:57:27 Replies (2)

疯子好好活

<?php
$subject = <<<EOF
http://www.baidu.com/index.html http://www.baidu.com/1.html
http://post.baidu.com/index.html
http://mp3.baidu.com/index.html
http://www.baidu.com/3.html
http://post.baidu.com/2.html
EOF;
// Ungreedy (U) match: capture everything between "http://" and the next "/".
preg_match_all('|http://(.*)/|U', $subject, $matches);
$res = array_count_values($matches[1]);
// Sort by count, highest first, as the question requires.
arsort($res);
foreach ($res as $key => $value) {
    echo $value." ".$key."\n";
}

Posted 2015-01-04 01:20:26 Replies (0)