Sunday, November 24, 2019

How to Find Highest Repeating Word from a Text File in Java - Word Count Problem

How to find the words and their count from a text file is another frequently asked coding question from Java interviews. The logic to solve this problem is similar to what we have seen in how to find duplicate words in a String. In the first step you need to build a word Map by reading the contents of a text file. This Map should contain each word as a key and its count as the value. Once you have this Map ready, you can sort its entries by value. If you don't know how to sort a Map on values, read up on sorting a HashMap by values first. Getting keys and values in sorted order should then be easy, but remember that HashMap doesn't maintain order, hence you need to use a List to keep the entries in sorted order. Once you have this list, you can just loop over it and print each key and value from the entry. This way, you can also create a table of words and their count in decreasing order. This problem is sometimes also asked as printing all words and their count in tabular format.
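For the tabular variant of the question, a minimal sketch could look like the following. The class name `WordTable`, the method `format` and the column widths are assumptions made for illustration; they are not part of the program shown later in this post.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the original program.
class WordTable {

    // Formats entries (assumed to be already sorted by the caller)
    // as a two-column table: word on the left, count on the right.
    static String format(List<Map.Entry<String, Integer>> sortedEntries) {
        StringBuilder sb = new StringBuilder();
        sb.append(String.format("%-15s %5s%n", "Word", "Count"));
        for (Map.Entry<String, Integer> entry : sortedEntries) {
            sb.append(String.format("%-15s %5d%n", entry.getKey(), entry.getValue()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the sample rows stay in this order.
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("java", 3);
        counts.put("language", 2);
        System.out.print(format(new ArrayList<>(counts.entrySet())));
    }
}
```

The fixed width of 15 characters is arbitrary; longer words would simply push the count column to the right.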



How to find highest repeated word from a file

Here is the Java program to find the duplicate word which has occurred the maximum number of times in a file. You can also print the frequency of words from highest to lowest, because you have a List of Map entries sorted by count. All you need to do is iterate over each entry of the List and print the keys and values.
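If only the single most frequent word is needed, sorting every entry is not strictly required. The sketch below assumes Java 8 or later and a non-empty map; `HighestWord` and `highest` are hypothetical names used only for this example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the original program.
class HighestWord {

    // Returns the entry with the largest count in a single pass.
    // Ties are resolved arbitrarily, and an empty map makes
    // Collections.max throw NoSuchElementException.
    static Map.Entry<String, Integer> highest(Map<String, Integer> wordMap) {
        return Collections.max(wordMap.entrySet(), Map.Entry.comparingByValue());
    }

    public static void main(String[] args) {
        Map<String, Integer> wordMap = new HashMap<>();
        wordMap.put("java", 3);
        wordMap.put("of", 2);
        wordMap.put("its", 1);
        Map.Entry<String, Integer> top = highest(wordMap);
        System.out.println(top.getKey() + " => " + top.getValue());
    }
}
```

This is a linear scan over the entries, whereas sorting the whole list costs more but also gives the full ranking.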



The most important part of this solution is sorting all entries. Since Map.Entry doesn't implement the Comparable interface, we need to write our own custom Comparator to sort the entries. If you look at my implementation, I am comparing entries on their values because that's what we want. Many programmers ask, why not use the LinkedHashMap class? But remember, LinkedHashMap keeps the keys in insertion order (and TreeMap keeps them sorted by key); neither orders entries by value. So you need this special Comparator to compare values and store the entries in a List.
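On Java 8 or later, the same value-based ordering can probably be expressed without an anonymous class, using the comparators that Map.Entry provides. This is a sketch under that assumption; the class name `SortByValue`, the method name and the alphabetical tie-breaker are additions for illustration, not part of the original program.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the original program.
class SortByValue {

    // Copies the entries into a List and sorts them by count, highest first.
    // Words with equal counts are ordered alphabetically (an added assumption,
    // so that the result does not depend on HashMap iteration order).
    static List<Map.Entry<String, Integer>> sortDescending(Map<String, Integer> wordMap) {
        List<Map.Entry<String, Integer>> list = new ArrayList<>(wordMap.entrySet());
        Comparator<Map.Entry<String, Integer>> byCountDesc =
                Map.Entry.comparingByValue(Comparator.reverseOrder());
        Comparator<Map.Entry<String, Integer>> byWord = Map.Entry.comparingByKey();
        list.sort(byCountDesc.thenComparing(byWord));
        return list;
    }

    public static void main(String[] args) {
        Map<String, Integer> wordMap = new HashMap<>();
        wordMap.put("its", 2);
        wordMap.put("java", 3);
        wordMap.put("of", 2);
        for (Map.Entry<String, Integer> entry : sortDescending(wordMap)) {
            System.out.println(entry.getKey() + " => " + entry.getValue());
        }
    }
}
```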

This problem can also be approached with a map-reduce technique: map every word to a count of one, then reduce by summing the counts per word.
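One way to sketch the map-reduce idea in plain Java is with the Streams API (Java 8 or later is assumed). The flatMap step plays the role of "map" and the grouping collector plays the role of "reduce". `WordCountStream` and `count` are hypothetical names, and the counts come back as Long because that is what `Collectors.counting()` produces.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the original program.
class WordCountStream {

    // Map step: split every line into lower-case words.
    // Reduce step: group identical words and count each group.
    static Map<String, Long> count(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\s+")))
                .filter(word -> !word.isEmpty()) // drop tokens from blank or indented lines
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("Java is a language", "java programming");
        System.out.println(count(lines).get("java"));
    }
}
```

This runs on a single machine; it only mirrors the shape of map-reduce and is not a distributed implementation.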




Java Program to Print Words and Their Count from a File

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Set;
import java.util.regex.Pattern;

/**
 * Java program to find count of repeated words in a file.
 *
 * @author
 */
public class Problem {

    public static void main(String args[]) {
        Map<String, Integer> wordMap = buildWordMap("C:/temp/words.txt");
        List<Entry<String, Integer>> list = sortByValueInDecreasingOrder(wordMap);
        System.out.println("List of repeated word from file and their count");
        for (Map.Entry<String, Integer> entry : list) {
            // print only the words which appear more than once
            if (entry.getValue() > 1) {
                System.out.println(entry.getKey() + " => " + entry.getValue());
            }
        }
    }

    public static Map<String, Integer> buildWordMap(String fileName) {
        // Using diamond operator for clean code
        Map<String, Integer> wordMap = new HashMap<>();
        // Using try-with-resource statement for automatic resource management
        try (FileInputStream fis = new FileInputStream(fileName);
                BufferedReader br = new BufferedReader(new InputStreamReader(fis))) {
            // words are separated by whitespace
            Pattern pattern = Pattern.compile("\\s+");
            String line = null;
            while ((line = br.readLine()) != null) {
                // do this if case sensitivity is not required i.e. Java = java
                line = line.toLowerCase();
                String[] words = pattern.split(line);
                for (String word : words) {
                    // skip empty tokens produced by blank lines or leading whitespace
                    if (word.isEmpty()) {
                        continue;
                    }
                    if (wordMap.containsKey(word)) {
                        wordMap.put(word, (wordMap.get(word) + 1));
                    } else {
                        wordMap.put(word, 1);
                    }
                }
            }
        } catch (IOException ioex) {
            ioex.printStackTrace();
        }
        return wordMap;
    }

    public static List<Entry<String, Integer>> sortByValueInDecreasingOrder(Map<String, Integer> wordMap) {
        Set<Entry<String, Integer>> entries = wordMap.entrySet();
        List<Entry<String, Integer>> list = new ArrayList<>(entries);
        // compare o2 to o1 (not o1 to o2) to get decreasing order of count
        Collections.sort(list, new Comparator<Map.Entry<String, Integer>>() {
            @Override
            public int compare(Map.Entry<String, Integer> o1, Map.Entry<String, Integer> o2) {
                return (o2.getValue()).compareTo(o1.getValue());
            }
        });
        return list;
    }
}

Output:
List of repeated word from file and their count
its => 2
of => 2
programming => 2
java => 2
language => 2


Things to note

If you are writing code in interviews, make sure it is production quality code, which means you must handle as many errors as possible, write unit tests, comment the code and do proper resource management. Here are a couple more points to remember:

1) Close files and streams once you are through with them, and learn the right way to close a stream. If you are on Java 7 or later, just use the try-with-resources statement.

2) Since the size of the file is not specified, the interviewer may grill you on cases like: what happens if the file is large? With a very large file, your program may run out of memory and throw java.lang.OutOfMemoryError: Java heap space. One solution is to do this task in chunks, e.g. first read 20% of the content and find the maximum repeated word in that, then read the next 20% and find the repeated maximum by taking the previous result into consideration. Keep in mind that for an exact answer the counts have to be merged across chunks, because a word that is never the maximum within any single chunk can still be the most frequent overall. This way, you don't need to hold the whole file content in memory and you can process a file of arbitrary length.

3) Always use Generics for type-safety.
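As a rough illustration of points 1 and 2, the sketch below reads its input one line at a time inside a try-with-resources block, so only the current line and the running counts are held in memory. It assumes Java 8 or later (for `Map.merge`) and takes a `Reader` so it can be tried without a file; `StreamingWordCount` and `count` are hypothetical names. Memory use still grows with the number of distinct words, so this is not a complete answer for inputs with a huge vocabulary.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the original program.
class StreamingWordCount {

    // Counts words line by line; the reader is closed automatically
    // by try-with-resources, even if reading fails part-way through.
    static Map<String, Integer> count(Reader source) {
        Map<String, Integer> counts = new HashMap<>();
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                for (String word : line.toLowerCase().split("\\s+")) {
                    if (!word.isEmpty()) {
                        // insert 1 for a new word, otherwise add 1 to the old count
                        counts.merge(word, 1, Integer::sum);
                    }
                }
            }
        } catch (IOException e) {
            // rethrow unchecked so callers are not forced to handle IOException
            throw new UncheckedIOException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = count(new StringReader("Java is java\nof JAVA"));
        System.out.println("java => " + counts.get("java"));
    }
}
```

For a real file, a `java.io.FileReader` or `java.nio.file.Files.newBufferedReader` could be passed in place of the StringReader.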


That's all on how to find repeated words from a file and print their count. You can apply the same technique to find duplicate words in a String. Since you now have a sorted list of words and their count, you can also find the maximum, the minimum, or the repeated words whose count is more than a specific number.


