
How to implement IP flow statistics

The first column is a serial number, the second is a phone number, the third is upflow data, and the fourth is downflow data. I want to run a MapReduce program that adds the upflow to the downflow for each phone number. When no field is empty, it runs successfully.

  1,1120487,10,20
  2,1120417,20,30
  3,1120427,30,40
  4,1120437,,50
    public class FlowMapper extends Mapper<LongWritable, Text, IntWritable, FlowBean> {
        IntWritable phone = new IntWritable();
        FlowBean flowbean = new FlowBean();

        @Override
        protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            int arg1;
            int arg2;
            String[] arr = value.toString().split("\t");
            phone.set(Integer.parseInt(arr[1]));
            // System.out.println(Arrays.toString(arr));
            // if (!Character.isDigit(Integer.parseInt(arr[2]))) {
            //     arg1 = 0;
            //     System.out.println("come in ");
            // } else {
            //     arg1 = Integer.parseInt(arr[2]);
            //     System.out.println("is this in your think");
            // }
            // if (arr[3] == null) {
            //     arg2 = 0;
            // } else {
            //     arg2 = Integer.parseInt(arr[3]);
            // }
            // System.out.println(arg1);
            // System.out.println(arg2);
            arg1 = Integer.parseInt(arr[2]);
            arg2 = Integer.parseInt(arr[3]);
            flowbean.set(arg1, arg2);
            context.write(phone, flowbean);
        }
    }

As you can see, I tried in the commented-out section, but I failed. I want to assign the number 0 when the data is empty.

OneCricketeer
David_Guan
  • May I ask why MapReduce instead of Spark/Flink? – OneCricketeer Mar 22 '20 at 16:05
  • hello, I'm just starting to learn Hadoop and want to understand MapReduce. Then I ran into this problem of how to turn an empty value into the number 0 – David_Guan Mar 23 '20 at 11:31
  • Okay, sure. Spark follows the same distributed processing principles. My point is that I've been working in Hadoop for over half a decade and nobody seriously writes (low-level) MapReduce code. I would hate to see you waste your time on it – OneCricketeer Mar 23 '20 at 13:19
  • I also think I need not spend a lot of time on the low level. Unfortunately, my company chose Flink and Hive. Do you have any recommended books or other resources for these two technologies? – David_Guan Mar 23 '20 at 13:31
  • I've not learned Flink outside of the official docs. The Hive wiki is great. I'm sure there are plenty of Manning / O'Reilly books or YouTube videos out there... Just search, no need to ask – OneCricketeer Mar 23 '20 at 14:54
  • Okay, one last question: take Spark, for example. If I can do something with MapReduce, do you think it is helpful for learning Spark? – David_Guan Mar 23 '20 at 22:51
  • Spark doesn't use the MapReduce API. It has `map()` and `reduce()` functions which achieve very similar goals, though. Or, as mentioned, Hive works great with the data you have shown. So does Apache Pig. – OneCricketeer Mar 23 '20 at 23:41

1 Answer


Your lines contain no tabs, so this never returns an array with more than one item:

value.toString().split("\t")

And no item in the array will ever be null, only an empty string.
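A standalone demonstration of both points, using the problem row from the question (the class name `SplitDemo` is just for illustration):

```java
public class SplitDemo {
    public static void main(String[] args) {
        String line = "4,1120437,,50"; // row with an empty upflow field

        // The line contains no tab, so splitting on "\t" returns the
        // whole line as a single-element array.
        String[] byTab = line.split("\t");
        System.out.println(byTab.length); // 1

        // Splitting on commas returns four elements; the missing field
        // arrives as an empty string "", never as null.
        String[] byComma = line.split(",");
        System.out.println(byComma.length);       // 4
        System.out.println(byComma[2].isEmpty()); // true
    }
}
```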


Change the code to split on commas, or rewrite it all in Spark, which has a built-in CSV reader.

Either way, I suggest you learn how to use JUnit.
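A minimal sketch of that fix, kept as plain Java so it runs without Hadoop (the `parseOrZero` helper is a name I made up; in the real mapper you would split on commas and call it on `arr[2]` and `arr[3]` before `flowbean.set`):

```java
public class FlowParseDemo {
    // Hypothetical helper: return 0 for a null or empty field,
    // otherwise parse it as an int.
    static int parseOrZero(String field) {
        return (field == null || field.isEmpty()) ? 0 : Integer.parseInt(field);
    }

    public static void main(String[] args) {
        // Split on commas, not tabs, to match the input format.
        String[] arr = "4,1120437,,50".split(",");
        int upflow = parseOrZero(arr[2]);   // empty upflow -> 0
        int downflow = parseOrZero(arr[3]); // "50" -> 50
        System.out.println(upflow + downflow); // 50
    }
}
```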

OneCricketeer
  • hello, I tried to do what you said; the data becomes '7,1400487,,20'. I replaced the commented section with this: ` if(arr[2]=="" || arr[2]==null){ arg1 = 0; }` But it's still not right – David_Guan Mar 23 '20 at 11:34
  • It shouldn't become that. It would be an array. Again, use JUnit to test your code. Use breakpoints to debug. Basically, please learn pure Java before writing MapReduce in it, or anything else outside of the standard library for that matter: https://stackoverflow.com/questions/513832/how-do-I-compare-strings-in-java if you want to use ==, or write MapReduce in Python or Scala – OneCricketeer Mar 23 '20 at 13:18
  • Thank you very much. I finally know how to solve this problem. Thanks for your enthusiasm and friendliness – David_Guan Mar 23 '20 at 13:28