Class CsvParser
- java.lang.Object
-
- com.univocity.parsers.common.AbstractParser<CsvParserSettings>
-
- com.univocity.parsers.csv.CsvParser
-
public final class CsvParser extends AbstractParser<CsvParserSettings>
A very fast CSV parser implementation.
- See Also:
CsvFormat, CsvParserSettings, CsvWriter, AbstractParser
-
-
Field Summary
Fields
Modifier and Type Field Description
private boolean
backToDelimiter
private char
delimiter
private char[]
delimiters
private boolean
doNotEscapeUnquotedValues
private java.lang.String
emptyValue
private char
escapeEscape
private int
formatDetectorRowSampleCount
private boolean
keepEscape
private boolean
keepQuotes
private int
match
private int
maxColumnLength
private char[]
multiDelimiter
private char
newLine
private boolean
normalizeLineEndingsInQuotes
private java.lang.String
nullValue
private boolean
parseUnescapedQuotes
private boolean
parseUnescapedQuotesUntilDelimiter
private char
prev
private char
quote
private char
quoteEscape
private UnescapedQuoteHandling
quoteHandling
private boolean
trimQuotedLeading
private boolean
trimQuotedTrailing
private boolean
unescaped
private DefaultCharAppender
whitespaceAppender
-
Fields inherited from class com.univocity.parsers.common.AbstractParser
ch, comment, comments, context, ignoreLeadingWhitespace, ignoreTrailingWhitespace, input, lastComment, output, processor, settings, whitespaceRangeStart
-
-
Constructor Summary
Constructors
Constructor Description
CsvParser(CsvParserSettings settings)
The CsvParser supports all settings provided by CsvParserSettings, and requires this configuration to be properly initialized.
-
Method Summary
Modifier and Type Method Description
private void
appendUntilMultiDelimiter()
protected boolean
consumeValueOnEOF()
Allows the parser implementation to handle any value that was being consumed when the end of the input was reached.
CsvFormat
getDetectedFormat()
Returns the CSV format detected when one of the following settings is enabled:
- CommonParserSettings.isLineSeparatorDetectionEnabled()
- CsvParserSettings.isDelimiterDetectionEnabled()
- CsvParserSettings.isQuoteDetectionEnabled()
The detected format will be available once the parsing process is initialized.
protected InputAnalysisProcess
getInputAnalysisProcess()
Allows the parser implementation to traverse the input buffer before the parsing process starts, in order to enable automatic configuration and discovery of data formats.
private boolean
handleUnescapedQuote()
private void
handleUnescapedQuoteInValue()
private void
handleValueSkipping(boolean quoted)
private boolean
matchDelimiter()
private boolean
matchDelimiterAfterQuote()
private int
nextDelimiter()
private void
parseMultiDelimiterRecord()
private void
parseQuotedValue()
private void
parseQuotedValueMultiDelimiter()
protected void
parseRecord()
Parser-specific implementation for reading a single record from the input.
private void
parseSingleDelimiterRecord()
private void
parseValueProcessingEscape()
private void
parseValueProcessingEscapeMultiDelimiter()
private void
processQuoteEscape()
private void
saveMatchingCharacters()
private void
skipValue()
private void
skipWhitespace()
void
updateFormat(CsvFormat format)
Allows changing the format of the input on the fly.
-
Methods inherited from class com.univocity.parsers.common.AbstractParser
beginParsing, beginParsing, beginParsing, beginParsing, beginParsing, beginParsing, beginParsing, createParsingContext, getContext, getRecordMetadata, inComment, initialize, iterate, iterate, iterate, iterate, iterate, iterate, iterate, iterateRecords, iterateRecords, iterateRecords, iterateRecords, iterateRecords, iterateRecords, iterateRecords, parse, parse, parse, parse, parse, parse, parse, parseAll, parseAll, parseAll, parseAll, parseAll, parseAll, parseAll, parseAll, parseAll, parseAll, parseAll, parseAll, parseAll, parseAll, parseAll, parseAll, parseAllRecords, parseAllRecords, parseAllRecords, parseAllRecords, parseAllRecords, parseAllRecords, parseAllRecords, parseAllRecords, parseAllRecords, parseAllRecords, parseAllRecords, parseAllRecords, parseAllRecords, parseAllRecords, parseAllRecords, parseAllRecords, parseLine, parseNext, parseNextRecord, parseRecord, processComment, reloadHeaders, stopParsing
-
-
-
-
Field Detail
-
parseUnescapedQuotes
private boolean parseUnescapedQuotes
-
parseUnescapedQuotesUntilDelimiter
private boolean parseUnescapedQuotesUntilDelimiter
-
backToDelimiter
private boolean backToDelimiter
-
doNotEscapeUnquotedValues
private final boolean doNotEscapeUnquotedValues
-
keepEscape
private final boolean keepEscape
-
keepQuotes
private final boolean keepQuotes
-
unescaped
private boolean unescaped
-
prev
private char prev
-
delimiter
private char delimiter
-
multiDelimiter
private char[] multiDelimiter
-
quote
private char quote
-
quoteEscape
private char quoteEscape
-
escapeEscape
private char escapeEscape
-
newLine
private char newLine
-
whitespaceAppender
private final DefaultCharAppender whitespaceAppender
-
normalizeLineEndingsInQuotes
private final boolean normalizeLineEndingsInQuotes
-
quoteHandling
private UnescapedQuoteHandling quoteHandling
-
nullValue
private final java.lang.String nullValue
-
maxColumnLength
private final int maxColumnLength
-
emptyValue
private final java.lang.String emptyValue
-
trimQuotedLeading
private final boolean trimQuotedLeading
-
trimQuotedTrailing
private final boolean trimQuotedTrailing
-
delimiters
private char[] delimiters
-
match
private int match
-
formatDetectorRowSampleCount
private int formatDetectorRowSampleCount
-
-
Constructor Detail
-
CsvParser
public CsvParser(CsvParserSettings settings)
The CsvParser supports all settings provided by CsvParserSettings, and requires this configuration to be properly initialized.
- Parameters:
settings - the parser configuration
-
-
Method Detail
-
parseRecord
protected final void parseRecord()
Description copied from class: AbstractParser
Parser-specific implementation for reading a single record from the input.
The AbstractParser handles the initialization and processing of the input until it is ready to be parsed.
It then delegates the input to the parser-specific implementation defined by AbstractParser.parseRecord(). In general, an implementation of AbstractParser.parseRecord() will perform the following steps:
- Test the character stored in ch and take some action on it (e.g. while (ch != '\n'){doSomething()})
- Request more characters by calling ch = input.nextChar();
- Append the desired characters to the output by executing, for example, output.appender.append(ch)
- Notify that a value of the record has been fully read by executing output.valueParsed(). This will clear the output appender (CharAppender) so the next call to output.appender.append(ch) will store the first character of the next parsed value
- Rinse and repeat until all values of the record are parsed
Once AbstractParser.parseRecord() returns, the AbstractParser takes over and handles the information (generally, reorganizing it and passing it on to a RowProcessor).
After the record processing, the AbstractParser reads the next characters from the input, delegating control again to the parseRecord() implementation for processing of the next record.
This cycle repeats until the reading process is stopped by the user, the input is exhausted, or an error happens.
In case of errors, the unchecked exception TextParsingException will be thrown and all resources in use will be closed automatically unless CommonParserSettings.isAutoClosingEnabled() evaluates to false. The exception should contain the cause and more information about where in the input the error happened.
- Specified by:
parseRecord in class AbstractParser<CsvParserSettings>
- See Also:
CharInputReader, CharAppender, ParserOutput, TextParsingException, RowProcessor
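The read, append, and notify cycle described for parseRecord() can be illustrated with a small stand-alone sketch. It does not use univocity's classes: CharSource, RecordOutput and RecordLoopSketch are hypothetical stand-ins for the parser's input, output and parseRecord() implementation. The sketch assumes a single ',' delimiter, '\n' as the record terminator, and no quote or escape handling.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified illustration of the parseRecord() cycle. Not univocity code. */
final class RecordLoopSketch {

    /** Stand-in for the parser's input: hands out one character at a time. */
    static final class CharSource {
        private final String text;
        private int pos;

        CharSource(String text) { this.text = text; }

        boolean hasNext() { return pos < text.length(); }

        char nextChar() { return text.charAt(pos++); }
    }

    /** Stand-in for the parser's output: buffers characters, then stores whole values. */
    static final class RecordOutput {
        private final StringBuilder appender = new StringBuilder();
        private final List<String> values = new ArrayList<>();

        void append(char ch) { appender.append(ch); }

        /** Stores the buffered value and clears the buffer for the next one. */
        void valueParsed() {
            values.add(appender.toString());
            appender.setLength(0);
        }

        List<String> values() { return values; }
    }

    /** Reads a single record: values split on ',', record ends at '\n' or end of input. */
    static List<String> parseRecord(CharSource input) {
        RecordOutput output = new RecordOutput();
        while (input.hasNext()) {
            char ch = input.nextChar();   // request the next character
            if (ch == '\n') {
                break;                    // end of this record
            } else if (ch == ',') {
                output.valueParsed();     // a value has been fully read
            } else {
                output.append(ch);        // part of the current value
            }
        }
        output.valueParsed();             // flush the value in progress at record end or EOF
        return output.values();
    }

    public static void main(String[] args) {
        CharSource input = new CharSource("a,b,c\n1,2,3");
        System.out.println(parseRecord(input)); // prints [a, b, c]
        System.out.println(parseRecord(input)); // prints [1, 2, 3]
    }
}
```

Each call consumes one record and leaves the source positioned at the start of the next. This mirrors how control returns to the caller after every record before parseRecord() is invoked again.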
-
parseSingleDelimiterRecord
private final void parseSingleDelimiterRecord()
-
skipValue
private void skipValue()
-
handleValueSkipping
private void handleValueSkipping(boolean quoted)
-
handleUnescapedQuoteInValue
private void handleUnescapedQuoteInValue()
-
nextDelimiter
private int nextDelimiter()
-
handleUnescapedQuote
private boolean handleUnescapedQuote()
-
processQuoteEscape
private void processQuoteEscape()
-
parseValueProcessingEscape
private void parseValueProcessingEscape()
-
parseQuotedValue
private void parseQuotedValue()
-
getInputAnalysisProcess
protected final InputAnalysisProcess getInputAnalysisProcess()
Description copied from class: AbstractParser
Allows the parser implementation to traverse the input buffer before the parsing process starts, in order to enable automatic configuration and discovery of data formats.
- Overrides:
getInputAnalysisProcess in class AbstractParser<CsvParserSettings>
- Returns:
a custom implementation of InputAnalysisProcess. By default, null is returned and no special input analysis will be performed.
-
getDetectedFormat
public final CsvFormat getDetectedFormat()
Returns the CSV format detected when one of the following settings is enabled:
- CommonParserSettings.isLineSeparatorDetectionEnabled()
- CsvParserSettings.isDelimiterDetectionEnabled()
- CsvParserSettings.isQuoteDetectionEnabled()
The detected format will be available once the parsing process is initialized and runs.
- Returns:
the detected CSV format, or null if no detection has been enabled or if the parsing process has not been started yet.
-
consumeValueOnEOF
protected final boolean consumeValueOnEOF()
Description copied from class: AbstractParser
Allows the parser implementation to handle any value that was being consumed when the end of the input was reached.
- Overrides:
consumeValueOnEOF in class AbstractParser<CsvParserSettings>
- Returns:
a flag indicating whether the parser was processing a value when the end of the input was reached.
-
updateFormat
public final void updateFormat(CsvFormat format)
Allows changing the format of the input on the fly.
- Parameters:
format - the new format to use.
-
skipWhitespace
private void skipWhitespace()
-
saveMatchingCharacters
private void saveMatchingCharacters()
-
matchDelimiter
private boolean matchDelimiter()
-
matchDelimiterAfterQuote
private boolean matchDelimiterAfterQuote()
-
parseMultiDelimiterRecord
private void parseMultiDelimiterRecord()
-
appendUntilMultiDelimiter
private void appendUntilMultiDelimiter()
-
parseQuotedValueMultiDelimiter
private void parseQuotedValueMultiDelimiter()
-
parseValueProcessingEscapeMultiDelimiter
private void parseValueProcessingEscapeMultiDelimiter()
-
-