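[Essay question]

The school library holds 3 million books. We want to count how many times the words Computer, Science, 计算机 ("computer"), and 科学 ("science") occur in them, broken down by calendar year: for example, the number of occurrences of each word in books published in 2016, then in 2015, and so on.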

是小毛吖

How are these essay questions supposed to be answered: by coding, or by describing the approach in words?

Posted on 2017-08-21 15:10:24 Replies (0)

牛客-68

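1. First, record the books from different years in separate files, e.g. 2016.txt, 2015.txt

2. For each year, iterate over that year's book entries and use a HashMap<String, Integer> map to tally each keyword and its number of occurrences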

// Keyword -> number of book titles that contain it.
HashMap<String, Integer> map = new HashMap<>();
map.put("Computer", 0);
map.put("Science", 0);
map.put("计算机", 0);
map.put("科学", 0);

public HashMap<String, Integer> calculateTimes(HashMap<String, Integer> map, String bookName) {
    if (bookName == null || bookName.isEmpty()) {
        return map;
    }

    // Check every keyword independently (not else-if): one title can contain several.
    // contains() counts a title at most once per keyword.
    for (String keyword : new String[] {"Computer", "Science", "计算机", "科学"}) {
        if (bookName.contains(keyword)) {
            map.merge(keyword, 1, Integer::sum);
        }
    }
    return map;
}
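
Note that `contains` only tells whether a title includes a keyword, so each book adds at most one to a count. If every occurrence has to be counted (for example when scanning a book's full text rather than its title), a small helper built on `String.indexOf` is one option. This is only a minimal sketch: the class name `OccurrenceCounter` and the method `countOccurrences` are made up for illustration and are not part of the answer above, and overlapping matches are not counted.

```java
public class OccurrenceCounter {
    /** Counts non-overlapping occurrences of keyword in text. */
    public static int countOccurrences(String text, String keyword) {
        if (text == null || keyword == null || keyword.isEmpty()) {
            return 0;
        }
        int count = 0;
        int from = 0;
        while (true) {
            int idx = text.indexOf(keyword, from);
            if (idx < 0) {
                break;
            }
            count++;
            // Continue searching right after the match just found.
            from = idx + keyword.length();
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "Computer Science: 计算机科学导论 for Computer engineers";
        System.out.println(countOccurrences(text, "Computer")); // 2
        System.out.println(countOccurrences(text, "科学"));      // 1
    }
}
```

The result of `countOccurrences` could then be added to the per-year map with `map.merge(keyword, n, Integer::sum)` instead of adding a fixed 1.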

Posted on 2017-08-23 20:57:18 Replies (2)

Jane201901021903521

books.stream().collect(Collectors.groupingBy(Book::getYear,Collectors.groupingBy(Book::getType,Collectors.counting())));
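
The one-liner above counts books per year and per type, and it presupposes a `Book` class with `getYear` and `getType`. To count keyword occurrences per year instead, one possible adaptation keeps the nested `groupingBy` but sums occurrences rather than counting books. The sketch below is an assumption-laden illustration: the `Book` fields, the `Hit` helper, `occurrences`, and `countByYear` are all hypothetical names that do not appear in the original answer.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class YearlyKeywordStats {
    static final List<String> KEYWORDS = Arrays.asList("Computer", "Science", "计算机", "科学");

    /** Hypothetical book model: publication year plus the text to search. */
    public static class Book {
        private final int year;
        private final String text;
        public Book(int year, String text) { this.year = year; this.text = text; }
        public int getYear() { return year; }
        public String getText() { return text; }
    }

    /** One (year, keyword, occurrences) record derived from a single book. */
    static class Hit {
        private final int year;
        private final String keyword;
        private final long count;
        Hit(int year, String keyword, long count) {
            this.year = year; this.keyword = keyword; this.count = count;
        }
        int getYear() { return year; }
        String getKeyword() { return keyword; }
        long getCount() { return count; }
    }

    /** Non-overlapping occurrences of a non-empty keyword in text. */
    static long occurrences(String text, String keyword) {
        long n = 0;
        for (int i = text.indexOf(keyword); i >= 0;
                i = text.indexOf(keyword, i + keyword.length())) {
            n++;
        }
        return n;
    }

    /** year -> (keyword -> total occurrences across that year's books). */
    public static Map<Integer, Map<String, Long>> countByYear(List<Book> books) {
        return books.stream()
                .flatMap(b -> KEYWORDS.stream()
                        .map(k -> new Hit(b.getYear(), k, occurrences(b.getText(), k))))
                .collect(Collectors.groupingBy(Hit::getYear,
                        Collectors.groupingBy(Hit::getKeyword,
                                Collectors.summingLong(Hit::getCount))));
    }

    public static void main(String[] args) {
        List<Book> books = Arrays.asList(
                new Book(2016, "Computer Science and Computer graphics"),
                new Book(2015, "计算机科学"));
        System.out.println(countByYear(books));
    }
}
```

Because every book emits one `Hit` per keyword, a keyword that never occurs in a given year still shows up with a total of 0 for that year.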

Posted on 2019-01-02 19:27:13 Replies (0)

echobird

Posted on 2017-08-24 09:48:12 Replies (0)

阿猫阿狗被占用

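Store each book in HDFS as one file, named as the time (4-digit year) + the book's id + the book's title.

Run the computation with MapReduce. The map output is <date, computer count; science count; 计算机 count; 科学 count>. The reduce output has the same shape, except that the counts in the value string are the totals. The code is as follows: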

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public static class MyMapper extends Mapper<LongWritable, Text, Text, Text> {
    private final Text outputKey = new Text();
    private final Text outputValue = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Get the HDFS file name; its first four characters are the year
        String filename = ((FileSplit) context.getInputSplit()).getPath().getName();
        String date = filename.substring(0, 4);

        // Count the occurrences of computer, science, 计算机 and 科学 in this line
        int computer = 0;
        int science = 0;
        int jisuanji = 0;
        int kexue = 0;

        // Splitting on whitespace only finds space-delimited words; Chinese text
        // without spaces would need word segmentation or a substring search.
        String[] words = value.toString().split("\\s+");
        for (String s : words) {
            // Ignore case so that "Computer" and "computer" both match
            if (s.equalsIgnoreCase("computer")) computer++;
            if (s.equalsIgnoreCase("science")) science++;
            if (s.equals("计算机")) jisuanji++;
            if (s.equals("科学")) kexue++;
        }

        outputKey.set(date);
        outputValue.set(computer + ";" + science + ";" + jisuanji + ";" + kexue);
        context.write(outputKey, outputValue);
    }
}

public static class MyReducer extends Reducer<Text, Text, Text, Text> {
    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        int allComputer = 0;
        int allScience = 0;
        int allJisuanji = 0;
        int allKexue = 0;

        // Each value is "computer;science;计算机;科学" for one input line of this year
        for (Text value : values) {
            String[] str = value.toString().split(";");
            allComputer += Integer.parseInt(str[0]);
            allScience += Integer.parseInt(str[1]);
            allJisuanji += Integer.parseInt(str[2]);
            allKexue += Integer.parseInt(str[3]);
        }

        String finalVal = allComputer + ";" + allScience + ";" + allJisuanji + ";" + allKexue;
        context.write(key, new Text(finalVal));
    }
}

Edited on 2017-08-17 20:44:43 Replies (5)

飞将军

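Isn't this just Hadoop's WordCount example program? The processing goes as follows:

1. Map task processing

a) Read the contents of the input file and parse each line into a key-value pair. The map function is called once for each key-value pair.

b) The map function holds your own logic: it processes the input key and value and emits new key-value pairs.

c) Partition the emitted key-value pairs.

d) Within each partition, sort and group the data by key, so that the values sharing a key are collected into one set.

2. Reduce task processing

a) The outputs of the map tasks are copied over the network, partition by partition, to the corresponding reduce nodes.

b) Merge and sort the outputs of the map tasks. The reduce function holds your own logic: it processes each input key together with its values and emits new key-value pairs.

c) Save the reduce output to files.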
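
The steps above can be sketched without Hadoop as a single-process simulation, which may help to see what each phase contributes. Everything here is an illustrative assumption: the class `MiniMapReduce`, the `year|keyword` key format, and the idea that each input record arrives as a (year, line) pair are not taken from Hadoop or from the answer. Partitioning and the network copy are left out because there is only one process.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MiniMapReduce {
    static final List<String> KEYWORDS = Arrays.asList("Computer", "Science", "计算机", "科学");

    /** Map step: for one (year, line) record, emit a "year|keyword" key per keyword hit. */
    public static List<String> map(int year, String line) {
        List<String> emitted = new ArrayList<>();
        for (String keyword : KEYWORDS) {
            int from = 0;
            int idx;
            while ((idx = line.indexOf(keyword, from)) >= 0) {
                emitted.add(year + "|" + keyword); // the value is implicitly 1
                from = idx + keyword.length();
            }
        }
        return emitted;
    }

    /** Shuffle step: sort by key and collect the 1s that share a key. */
    public static Map<String, List<Integer>> shuffle(List<String> keys) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String key : keys) {
            grouped.computeIfAbsent(key, k -> new ArrayList<>()).add(1);
        }
        return grouped;
    }

    /** Reduce step: sum the grouped values of each key. */
    public static Map<String, Integer> reduce(Map<String, List<Integer>> grouped) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            int sum = 0;
            for (int v : entry.getValue()) {
                sum += v;
            }
            totals.put(entry.getKey(), sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<String> emitted = new ArrayList<>();
        emitted.addAll(map(2016, "Computer Science, 计算机科学"));
        emitted.addAll(map(2016, "Computer graphics"));
        emitted.addAll(map(2015, "Science fiction"));
        System.out.println(reduce(shuffle(emitted)));
    }
}
```

In a real job the framework performs the shuffle itself, so only the bodies of `map` and `reduce` would be written by hand.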

Posted on 2017-08-25 16:00:23 Replies (0)

高非凡

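Approach:

1. Use a HashMap to store the count for each year: HashMap<Integer,Double> map = new HashMap<Integer,Double>();

2. Create a Book class with two main fields: (1) publication year: Integer pressYear; (2) book content: String content;

3. Feed every book in the library into the program one by one and check whether each book's content contains any of those words. If it does, add one to that year's value in the map.

4. Finally, with this map in hand, calling map.get(year) returns the result for whichever year you want.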

public class Book {
    private Integer pressYear; // publication year
    String content;            // book content
}

public class TestMain {
}
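
Steps 1 to 4 above could be filled in roughly as follows. This is a sketch under assumptions rather than the answer's own code: it keeps one inner map per year so that the four words are tallied separately (the question asks for each word's count), it uses Integer counts, and the names `YearKeywordTally` and `add` are hypothetical. Like step 3, it adds one per book that contains a word, not one per occurrence.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class YearKeywordTally {
    static final String[] KEYWORDS = {"Computer", "Science", "计算机", "科学"};

    /** Mirrors the Book class of the answer: publication year and book content. */
    public static class Book {
        final Integer pressYear;
        final String content;
        public Book(Integer pressYear, String content) {
            this.pressYear = pressYear;
            this.content = content;
        }
    }

    /** Adds one to counts[year][keyword] for every keyword the book's content contains. */
    public static void add(Map<Integer, Map<String, Integer>> counts, Book book) {
        Map<String, Integer> perYear =
                counts.computeIfAbsent(book.pressYear, year -> new HashMap<>());
        for (String keyword : KEYWORDS) {
            if (book.content.contains(keyword)) {
                perYear.merge(keyword, 1, Integer::sum);
            }
        }
    }

    public static void main(String[] args) {
        Map<Integer, Map<String, Integer>> counts = new HashMap<>();
        for (Book book : Arrays.asList(
                new Book(2016, "Computer Science"),
                new Book(2016, "计算机科学"),
                new Book(2015, "Science"))) {
            add(counts, book);
        }
        // Step 4: look up whichever year is needed.
        System.out.println(counts.get(2016));
    }
}
```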

Posted on 2021-03-05 21:01:51 Replies (0)

吴彦祖吴彦祖

Count <Computer, Science, 计算机, 科学> group by year

Posted on 2020-09-20 21:53:23 Replies (0)

小小霞霞

// Keyword -> number of book titles that contain it.
HashMap<String, Integer> map = new HashMap<>();
map.put("Computer", 0);
map.put("Science", 0);
map.put("计算机", 0);
map.put("科学", 0);

public HashMap<String, Integer> calculateTimes(HashMap<String, Integer> map, String bookName) {
    if (bookName == null || bookName.isEmpty()) {
        return map;
    }

    // Check every keyword independently (not else-if): one title can contain several.
    // contains() counts a title at most once per keyword.
    for (String keyword : new String[] {"Computer", "Science", "计算机", "科学"}) {
        if (bookName.contains(keyword)) {
            map.merge(keyword, 1, Integer::sum);
        }
    }
    return map;
}

Posted on 2019-08-27 17:03:46 Replies (0)

馒头22

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

public class Demo {
    static HashMap<String, Integer> map = new HashMap<>();
    static {
        map.put("Computer", 0);
        map.put("Science", 0);
        map.put("计算机", 0);
        map.put("科学", 0);
    }

    public void check(File file) {
        // Invalid input
        if (!file.exists()) {
            return;
        }

        // Read the file through a Scanner; try-with-resources closes both streams
        try (InputStream input = new FileInputStream(file);
             Scanner scanner = new Scanner(input, "UTF-8")) {
            scanner.useDelimiter("\n");
            // Process the book title on each line
            while (scanner.hasNext()) {
                calculatorTimes(map, scanner.next());
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public void calculatorTimes(HashMap<String, Integer> map, String bookName) {
        // Invalid input
        if (bookName.length() == 0) {
            return;
        }

        // Check every keyword independently: one title can contain several of them
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            if (bookName.contains(entry.getKey())) {
                entry.setValue(entry.getValue() + 1);
            }
        }
    }
}

Posted on 2019-07-12 15:37:02 Replies (1)

欣欣向荣201710261909935

With data at this scale, it probably can't be done without a distributed approach.

Posted on 2019-04-04 13:19:06 Replies (0)

吴佳力

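#include <vector>

struct Book {
    int nYear;
    long nComputer;
    long nScience;
    long n计算机;
    long n科学;
};

struct WordCount {
    long nComputer = 0;
    long nScience = 0;
    long n计算机 = 0;
    long n科学 = 0;
};

// Assumed to load every book together with its per-book keyword counts.
std::vector<Book> ReadInData();

int main()
{
    std::vector<Book> arrBooks = ReadInData();
    // Indexed by year; 2017 slots so that index 2016 is valid.
    std::vector<WordCount> arrWordCount(2017);

    for (const Book& book : arrBooks)
    {
        if (book.nYear < 0 || book.nYear > 2016) continue; // skip out-of-range years
        // Accumulate (+=) instead of overwriting the year's totals
        arrWordCount[book.nYear].nComputer += book.nComputer;
        arrWordCount[book.nYear].nScience += book.nScience;
        arrWordCount[book.nYear].n计算机 += book.n计算机;
        arrWordCount[book.nYear].n科学 += book.n科学;
    }

    // Output the results
    return 0;
}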

Posted on 2019-03-21 23:24:01 Replies (0)

风之歌-

import java.util.*;

public class Main {
    public static void main(String[] args) {
        int num = 3000000; // total number of books
        Book[] books = new Book[num]; // must be filled with the real books before counting
        long times = 0; // number of books whose title contains a keyword
        for (int i = 0; i < num; i++) {
            // Skip slots that have not been filled in
            if (books[i] != null && books[i].hasKeyword()) {
                times++;
            }
        }
        System.out.println(times);
    }
}

class Book {
    private String year; // publication year
    private String name;

    public Book(String year, String name) {
        this.year = year;
        this.name = name;
    }

    // Whether the title contains any of the specified keywords
    public boolean hasKeyword() {
        return this.name.contains("Computer")
                || this.name.contains("Science")
                || this.name.contains("计算机")
                || this.name.contains("科学");
    }

    public void setYear(String year) {
        this.year = year;
    }

    public String getYear() {
        return this.year;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getName() {
        return this.name;
    }
}

Posted on 2019-03-21 21:34:13 Replies (0)

mybestwish


// Keyword -> number of book titles that contain it.
HashMap<String, Integer> map = new HashMap<>();
map.put("Computer", 0);
map.put("Science", 0);
map.put("计算机", 0);
map.put("科学", 0);

public HashMap<String, Integer> calculateTimes(HashMap<String, Integer> map, String bookName) {
    if (bookName == null || bookName.isEmpty()) {
        return map;
    }

    // Check every keyword independently (not else-if): one title can contain several.
    // contains() counts a title at most once per keyword.
    for (String keyword : new String[] {"Computer", "Science", "计算机", "科学"}) {
        if (bookName.contains(keyword)) {
            map.merge(keyword, 1, Integer::sum);
        }
    }
    return map;
}