Class CsvFormatDetector

    • Constructor Summary

      Constructors 
      Constructor Description
      CsvFormatDetector​(int maxRowSamples, CsvParserSettings settings, int whitespaceRangeStart)
      Builds a new CsvFormatDetector
    • Method Summary

      Modifier and Type Method Description
      protected abstract void apply​(char delimiter, char quote, char quoteEscape)
      Applies the discovered CSV format elements to the CsvParser
      protected java.util.Map<java.lang.Character,​java.lang.Integer> calculateTotals​(java.util.List<java.util.Map<java.lang.Character,​java.lang.Integer>> symbolsPerRow)  
      void execute​(char[] characters, int length)
      Analyzes a sequence of characters from the input buffer.
      protected char getChar​(java.util.Map<java.lang.Character,​java.lang.Integer> map, java.util.Map<java.lang.Character,​java.lang.Integer> totals, char defaultChar, boolean min)
      Returns the character with the highest or lowest associated number.
      protected void increment​(java.util.Map<java.lang.Character,​java.lang.Integer> map, char symbol)
      Increments the number associated with a character in a map by 1
      protected void increment​(java.util.Map<java.lang.Character,​java.lang.Integer> map, char symbol, int incrementSize)
      Increments the number associated with a character in a map
      protected boolean isAllowedDelimiter​(char ch)  
      protected boolean isSymbol​(char ch)  
      protected char max​(java.util.Map<java.lang.Character,​java.lang.Integer> map, java.util.Map<java.lang.Character,​java.lang.Integer> totals, char defaultChar)
      Returns the character with the highest associated number.
      protected char min​(java.util.Map<java.lang.Character,​java.lang.Integer> map, java.util.Map<java.lang.Character,​java.lang.Integer> totals, char defaultChar)
      Returns the character with the lowest associated number.
      protected char pickDelimiter​(java.util.Map<java.lang.Character,​java.lang.Integer> sums, java.util.Map<java.lang.Character,​java.lang.Integer> totals)  
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • MAX_ROW_SAMPLES

        private final int MAX_ROW_SAMPLES
      • comment

        private final char comment
      • suggestedDelimiter

        private final char suggestedDelimiter
      • normalizedNewLine

        private final char normalizedNewLine
      • whitespaceRangeStart

        private final int whitespaceRangeStart
      • allowedDelimiters

        private char[] allowedDelimiters
      • delimiterPreference

        private char[] delimiterPreference
      • suggestedQuote

        private final char suggestedQuote
      • suggestedQuoteEscape

        private final char suggestedQuoteEscape
    • Constructor Detail

      • CsvFormatDetector

        public CsvFormatDetector​(int maxRowSamples,
                                 CsvParserSettings settings,
                                 int whitespaceRangeStart)
        Builds a new CsvFormatDetector
        Parameters:
        maxRowSamples - the number of row samples to collect before analyzing the statistics
        settings - the configuration provided by the user with potential defaults in case the detection is unable to discover the proper column delimiter or quote character.
        whitespaceRangeStart - starting range of characters considered to be whitespace.
    • Method Detail

      • calculateTotals

        protected java.util.Map<java.lang.Character,​java.lang.Integer> calculateTotals​(java.util.List<java.util.Map<java.lang.Character,​java.lang.Integer>> symbolsPerRow)
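        This method carries no description, so the following is only a sketch of what the name and signature suggest: summing the per-row symbol counts into one map of totals. The class name TotalsSketch is hypothetical, and the summing behaviour is an assumption rather than documented behaviour.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the library.
final class TotalsSketch {

    // Assumption: the total for a symbol is the sum of its counts across all sampled rows.
    static Map<Character, Integer> calculateTotals(List<Map<Character, Integer>> symbolsPerRow) {
        Map<Character, Integer> totals = new HashMap<>();
        for (Map<Character, Integer> row : symbolsPerRow) {
            for (Map.Entry<Character, Integer> entry : row.entrySet()) {
                // merge() stores the count on first sight and adds to it afterwards
                totals.merge(entry.getKey(), entry.getValue(), Integer::sum);
            }
        }
        return totals;
    }
}
```

        Under that assumption, a symbol that appears in only some rows still receives an entry, with the sum of the counts from the rows where it occurred.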
      • execute

        public void execute​(char[] characters,
                            int length)
        Description copied from interface: InputAnalysisProcess
        Analyzes a sequence of characters from the input buffer.
        Specified by:
        execute in interface InputAnalysisProcess
        Parameters:
        characters - the input buffer
        length - the last character position loaded into the buffer.
      • pickDelimiter

        protected char pickDelimiter​(java.util.Map<java.lang.Character,​java.lang.Integer> sums,
                                     java.util.Map<java.lang.Character,​java.lang.Integer> totals)
      • increment

        protected void increment​(java.util.Map<java.lang.Character,​java.lang.Integer> map,
                                 char symbol)
        Increments the number associated with a character in a map by 1
        Parameters:
        map - the map of characters and their numbers
        symbol - the character whose number should be incremented
      • increment

        protected void increment​(java.util.Map<java.lang.Character,​java.lang.Integer> map,
                                 char symbol,
                                 int incrementSize)
        Increments the number associated with a character in a map
        Parameters:
        map - the map of characters and their numbers
        symbol - the character whose number should be incremented
        incrementSize - the size of the increment
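        The two increment overloads can be illustrated with a minimal sketch over a plain java.util.Map. The class name IncrementSketch is hypothetical, and treating a missing entry as 0 is an assumption; the documentation above does not state how absent characters are handled.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the library.
final class IncrementSketch {

    // Adds incrementSize to the number stored for symbol.
    // Assumption: a character that is not yet in the map starts at 0.
    static void increment(Map<Character, Integer> map, char symbol, int incrementSize) {
        map.merge(symbol, incrementSize, Integer::sum);
    }

    // Convenience overload: increments by 1.
    static void increment(Map<Character, Integer> map, char symbol) {
        increment(map, symbol, 1);
    }

    public static void main(String[] args) {
        Map<Character, Integer> counts = new HashMap<>();
        // Count every non-alphanumeric character of one sample row
        for (char ch : "a;b;c".toCharArray()) {
            if (!Character.isLetterOrDigit(ch)) {
                increment(counts, ch);
            }
        }
        System.out.println(counts);
    }
}
```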
      • min

        protected char min​(java.util.Map<java.lang.Character,​java.lang.Integer> map,
                           java.util.Map<java.lang.Character,​java.lang.Integer> totals,
                           char defaultChar)
        Returns the character with the lowest associated number.
        Parameters:
        map - the map of characters and their numbers
        defaultChar - the default character to return in case the map is empty
        Returns:
        the character associated with the lowest number.
      • max

        protected char max​(java.util.Map<java.lang.Character,​java.lang.Integer> map,
                           java.util.Map<java.lang.Character,​java.lang.Integer> totals,
                           char defaultChar)
        Returns the character with the highest associated number.
        Parameters:
        map - the map of characters and their numbers
        defaultChar - the default character to return in case the map is empty
        Returns:
        the character associated with the highest number.
      • getChar

        protected char getChar​(java.util.Map<java.lang.Character,​java.lang.Integer> map,
                               java.util.Map<java.lang.Character,​java.lang.Integer> totals,
                               char defaultChar,
                               boolean min)
        Returns the character with the highest or lowest associated number.
        Parameters:
        map - the map of characters and their numbers
        defaultChar - the default character to return in case the map is empty
        min - a flag indicating whether to return the character associated with the lowest number in the map. If false, the character associated with the highest number is returned.
        Returns:
        the character associated with the highest or lowest number, depending on the min flag.
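        The selection described for min, max and getChar can be sketched as follows. This is a simplified illustration, not the library's code: the class name ExtremeCharSketch is hypothetical, the totals parameter is left out because its role is not documented above, and keeping the first entry encountered on a tie is an assumption.

```java
import java.util.Map;

// Hypothetical helper, not part of the library.
final class ExtremeCharSketch {

    // Returns the key with the lowest (min == true) or highest (min == false) value,
    // or defaultChar when the map is empty.
    static char getChar(Map<Character, Integer> map, char defaultChar, boolean min) {
        char result = defaultChar;
        Integer best = null;
        for (Map.Entry<Character, Integer> entry : map.entrySet()) {
            int value = entry.getValue();
            // Strict comparison: on a tie the entry seen first is kept (assumption)
            if (best == null || (min ? value < best : value > best)) {
                best = value;
                result = entry.getKey();
            }
        }
        return result;
    }

    static char min(Map<Character, Integer> map, char defaultChar) {
        return getChar(map, defaultChar, true);
    }

    static char max(Map<Character, Integer> map, char defaultChar) {
        return getChar(map, defaultChar, false);
    }
}
```

        In this sketch min and max are thin wrappers that fix the flag, mirroring how the documented signatures differ from getChar only by the missing boolean.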
      • isSymbol

        protected boolean isSymbol​(char ch)
      • isAllowedDelimiter

        protected boolean isAllowedDelimiter​(char ch)
      • apply

        protected abstract void apply​(char delimiter,
                                      char quote,
                                      char quoteEscape)
        Applies the discovered CSV format elements to the CsvParser
        Parameters:
        delimiter - the discovered delimiter character
        quote - the discovered quote character
        quoteEscape - the discovered quote escape character.
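        Because apply is abstract, a subclass decides what happens with the discovered characters. The sketch below shows that hook pattern in isolation. DetectorHookSketch is a hypothetical stand-in that copies only the apply signature, not the library class, and its finish method is an invented placeholder for whatever point in the analysis triggers the callback.

```java
// Hypothetical stand-in exposing only the apply(...) hook; not the library class.
abstract class DetectorHookSketch {

    protected abstract void apply(char delimiter, char quote, char quoteEscape);

    // Placeholder for the end of an analysis pass: forwards already-chosen values to the hook.
    final void finish(char delimiter, char quote, char quoteEscape) {
        apply(delimiter, quote, quoteEscape);
    }
}

// Subclass that simply records the values it is given.
final class CapturingDetector extends DetectorHookSketch {
    char delimiter;
    char quote;
    char quoteEscape;
    boolean applied;

    @Override
    protected void apply(char delimiter, char quote, char quoteEscape) {
        this.delimiter = delimiter;
        this.quote = quote;
        this.quoteEscape = quoteEscape;
        this.applied = true;
    }
}

final class ApplyHookDemo {
    public static void main(String[] args) {
        CapturingDetector detector = new CapturingDetector();
        detector.finish(';', '"', '\\');
        System.out.println("delimiter=" + detector.delimiter + " quote=" + detector.quote);
    }
}
```

        A real subclass would extend CsvFormatDetector itself and pass maxRowSamples, settings and whitespaceRangeStart to its constructor, then copy the three characters into the parser's format inside apply.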