Class FixedWidthParserSettings
- java.lang.Object
-
- com.univocity.parsers.common.CommonSettings<F>
-
- com.univocity.parsers.common.CommonParserSettings<FixedWidthFormat>
-
- com.univocity.parsers.fixed.FixedWidthParserSettings
-
- All Implemented Interfaces:
java.lang.Cloneable
public class FixedWidthParserSettings extends CommonParserSettings<FixedWidthFormat>
This is the configuration class used by the Fixed-Width parser (FixedWidthParser).
In addition to the configuration options provided by CommonParserSettings, the FixedWidthParserSettings include:
- skipTrailingCharsUntilNewline (defaults to false): Indicates whether or not any trailing characters beyond the record's length should be skipped until the newline is reached. For example, if the record length is 5, but the row contains "12345678\n", then the portion containing "678" will be discarded and not considered part of the next record.
- recordEndsOnNewline (defaults to false): Indicates whether or not a record is considered parsed when a newline is reached. For example, if recordEndsOnNewline is set to true, then given a record of length 4 and the input "12\n3456", the parser will identify [12] and [3456]. If recordEndsOnNewline is set to false, then given the same record length and input, the parser will identify a multi-line record [12\n3] and [456 ].
The FixedWidthParserSettings need a definition of the field lengths of each record in the input. This must be provided using an instance of FixedWidthFields.
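As an illustration of how these options fit together, the sketch below configures a parser for records made of a single field of length 4 and enables both flags. It relies on the constructors and setters listed on this page plus FixedWidthParser.parseAll(Reader). The field length, the input string, and the class name SettingsSketch are assumptions made for the example, and the univocity-parsers library must be on the classpath.

```java
import com.univocity.parsers.fixed.FixedWidthFields;
import com.univocity.parsers.fixed.FixedWidthParser;
import com.univocity.parsers.fixed.FixedWidthParserSettings;

import java.io.StringReader;
import java.util.Arrays;
import java.util.List;

public class SettingsSketch {
    public static void main(String[] args) {
        // Assumption for illustration: each record has one field of length 4.
        FixedWidthFields fields = new FixedWidthFields(4);

        FixedWidthParserSettings settings = new FixedWidthParserSettings(fields);
        settings.getFormat().setLineSeparator("\n");
        // A newline closes the current record even if it is shorter than 4 characters.
        settings.setRecordEndsOnNewline(true);
        // Characters beyond the record length are dropped up to the next newline.
        settings.setSkipTrailingCharsUntilNewline(true);

        FixedWidthParser parser = new FixedWidthParser(settings);
        List<String[]> rows = parser.parseAll(new StringReader("12\n3456"));
        for (String[] row : rows) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Both flags default to false, so omitting the two setter calls gives the multi-line behaviour described above.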
-
-
Field Summary
Modifier and Type Field Description
private FixedWidthFields
fieldLengths
private boolean
keepPadding
private java.util.Map<java.lang.String,FixedWidthFields>
lookaheadFormats
private java.util.Map<java.lang.String,FixedWidthFields>
lookbehindFormats
protected boolean
recordEndsOnNewline
protected boolean
skipTrailingCharsUntilNewline
private boolean
useDefaultPaddingForHeaders
-
Fields inherited from class com.univocity.parsers.common.CommonParserSettings
headerExtractionEnabled
-
-
Constructor Summary
Constructor Description
FixedWidthParserSettings()
Creates a basic configuration object for the Fixed-Width parser with no field length configuration.
FixedWidthParserSettings(FixedWidthFields fieldLengths)
Creates a configuration object from a definition of the field lengths of each record in the input.
-
Method Summary
Modifier and Type Method Description
protected void
addConfiguration(java.util.Map<java.lang.String,java.lang.Object> out)
void
addFormatForLookahead(java.lang.String lookahead, FixedWidthFields lengths)
Defines the format of records identified by a lookahead symbol.
void
addFormatForLookbehind(java.lang.String lookbehind, FixedWidthFields lengths)
Defines the format of records identified by a lookbehind symbol.
private int[]
calculateMaxFieldLengths()
FixedWidthParserSettings
clone()
Clones this configuration object to reuse all user-provided settings, including the fixed-width field configuration.
protected FixedWidthParserSettings
clone(boolean clearInputSpecificSettings)
Deprecated. Doesn't really make sense for fixed-width.
private FixedWidthParserSettings
clone(boolean clearInputSpecificSettings, FixedWidthFields fields)
FixedWidthParserSettings
clone(FixedWidthFields fields)
Clones this configuration object to reuse most user-provided settings.
protected void
configureFromAnnotations(java.lang.Class<?> beanClass)
Configures the parser based on the annotations provided in a given class.
protected FixedWidthFormat
createDefaultFormat()
Returns the default FixedWidthFormat configured to handle Fixed-Width inputs.
(package private) int[]
getAllLengths()
(package private) FieldAlignment[]
getFieldAlignments()
Returns the sequence of alignments to consider for each field of each record.
(package private) int[]
getFieldLengths()
Returns the sequence of lengths to be read by the parser to form a record.
(package private) char[]
getFieldPaddings()
Returns the sequence of paddings used by each field of each record.
(package private) boolean[]
getFieldsToIgnore()
Returns the sequence of fields to ignore.
boolean
getKeepPadding()
Indicates whether the padding character should be kept in the parsed value (defaults to false). This setting can be overridden for individual fields through FixedWidthFields.stripPaddingFrom(String, String...) and FixedWidthFields.keepPaddingOn(String, String...).
(package private) java.lang.Boolean[]
getKeepPaddingFlags()
Returns the sequence of fields whose padding character must/must not be retained in the parsed value.
(package private) Lookup[]
getLookaheadFormats()
(package private) Lookup[]
getLookbehindFormats()
int
getMaxCharsPerColumn()
The maximum number of characters allowed for any given value being written/read.
int
getMaxColumns()
Returns the hard limit of how many columns a record can have (defaults to a maximum of 512).
boolean
getRecordEndsOnNewline()
Indicates whether or not a record is considered parsed when a newline is reached.
boolean
getSkipTrailingCharsUntilNewline()
Indicates whether or not any trailing characters beyond the record's length should be skipped until the newline is reached (defaults to false).
boolean
getUseDefaultPaddingForHeaders()
Indicates whether headers should be parsed using the default padding specified in FixedWidthFormat.getPadding() instead of any custom padding associated with a given field (in FixedWidthFields.setPadding(char, int...)). Defaults to true.
protected CharAppender
newCharAppender()
Returns an instance of CharAppender with the configured limit of maximum characters per column, the default value used to represent a null value (when the String parsed from the input is empty), and the padding character to handle unwritten positions.
void
setKeepPadding(boolean keepPadding)
Configures the fixed-width parser to retain the padding character in any parsed values (defaults to false). This setting can be overridden for individual fields through FixedWidthFields.stripPaddingFrom(String, String...) and FixedWidthFields.keepPaddingOn(String, String...).
void
setRecordEndsOnNewline(boolean recordEndsOnNewline)
Defines whether or not a record is considered parsed when a newline is reached.
void
setSkipTrailingCharsUntilNewline(boolean skipTrailingCharsUntilNewline)
Defines whether or not any trailing characters beyond the record's length should be skipped until the newline is reached (defaults to false).
void
setUseDefaultPaddingForHeaders(boolean useDefaultPaddingForHeaders)
Defines whether headers should be parsed using the default padding specified in FixedWidthFormat.getPadding() instead of any custom padding associated with a given field (in FixedWidthFields.setPadding(char, int...)).
-
Methods inherited from class com.univocity.parsers.common.CommonParserSettings
addInputAnalysisProcess, clearInputSpecificSettings, getInputAnalysisProcesses, getInputBufferSize, getNumberOfRecordsToRead, getNumberOfRowsToSkip, getProcessor, getReadInputOnSeparateThread, getRowProcessor, isAutoClosingEnabled, isColumnReorderingEnabled, isCommentCollectionEnabled, isCommentProcessingEnabled, isHeaderExtractionEnabled, isLineSeparatorDetectionEnabled, newCharInputReader, setAutoClosingEnabled, setColumnReorderingEnabled, setCommentCollectionEnabled, setCommentProcessingEnabled, setHeaderExtractionEnabled, setInputBufferSize, setLineSeparatorDetectionEnabled, setNumberOfRecordsToRead, setNumberOfRowsToSkip, setProcessor, setReadInputOnSeparateThread, setRowProcessor
-
Methods inherited from class com.univocity.parsers.common.CommonSettings
excludeFields, excludeFields, excludeIndexes, getErrorContentLength, getFormat, getHeaders, getIgnoreLeadingWhitespaces, getIgnoreTrailingWhitespaces, getNullValue, getProcessorErrorHandler, getRowProcessorErrorHandler, getSkipBitsAsWhitespace, getSkipEmptyLines, getWhitespaceRangeStart, isAutoConfigurationEnabled, isProcessorErrorHandlerDefined, selectFields, selectFields, selectIndexes, setAutoConfigurationEnabled, setErrorContentLength, setFormat, setHeaders, setIgnoreLeadingWhitespaces, setIgnoreTrailingWhitespaces, setMaxCharsPerColumn, setMaxColumns, setNullValue, setProcessorErrorHandler, setRowProcessorErrorHandler, setSkipBitsAsWhitespace, setSkipEmptyLines, toString, trimValues
-
-
-
-
Field Detail
-
skipTrailingCharsUntilNewline
protected boolean skipTrailingCharsUntilNewline
-
recordEndsOnNewline
protected boolean recordEndsOnNewline
-
useDefaultPaddingForHeaders
private boolean useDefaultPaddingForHeaders
-
keepPadding
private boolean keepPadding
-
fieldLengths
private FixedWidthFields fieldLengths
-
lookaheadFormats
private java.util.Map<java.lang.String,FixedWidthFields> lookaheadFormats
-
lookbehindFormats
private java.util.Map<java.lang.String,FixedWidthFields> lookbehindFormats
-
-
Constructor Detail
-
FixedWidthParserSettings
public FixedWidthParserSettings(FixedWidthFields fieldLengths)
Creates a configuration object from a definition of the field lengths of each record in the input. This must be provided using an instance of FixedWidthFields.
- Parameters:
fieldLengths
- the instance of FixedWidthFields which provides the lengths of each field in the fixed-width records to be parsed
- See Also:
FixedWidthFields
-
FixedWidthParserSettings
public FixedWidthParserSettings()
Creates a basic configuration object for the Fixed-Width parser with no field length configuration. This constructor is intended to be used when the record length varies depending on the input row. Refer to addFormatForLookahead(String, FixedWidthFields) and addFormatForLookbehind(String, FixedWidthFields).
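A minimal sketch of the no-argument constructor combined with lookahead formats follows. The field names, the lengths, the "N#" and "C#" markers, and the class name LookaheadSketch are hypothetical values chosen for illustration. Only FixedWidthFields.addField(String, int) and the methods documented on this page are assumed from the library.

```java
import com.univocity.parsers.fixed.FixedWidthFields;
import com.univocity.parsers.fixed.FixedWidthParser;
import com.univocity.parsers.fixed.FixedWidthParserSettings;

public class LookaheadSketch {
    public static void main(String[] args) {
        // Hypothetical layout for rows that start with "N#".
        FixedWidthFields accountFields = new FixedWidthFields();
        accountFields.addField("ID", 10);
        accountFields.addField("Bank", 8);

        // Hypothetical layout for rows that start with "C#".
        FixedWidthFields clientFields = new FixedWidthFields();
        clientFields.addField("Marker", 5);
        clientFields.addField("Name", 20);

        // No default field lengths: each row's format is chosen by its lookahead value.
        FixedWidthParserSettings settings = new FixedWidthParserSettings();
        settings.addFormatForLookahead("N#", accountFields);
        settings.addFormatForLookahead("C#", clientFields);

        // The parser can then be used as usual, e.g. with parseAll(Reader).
        FixedWidthParser parser = new FixedWidthParser(settings);
    }
}
```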
-
-
Method Detail
-
getFieldLengths
int[] getFieldLengths()
Returns the sequence of lengths to be read by the parser to form a record.
- Returns:
- the sequence of lengths to be read by the parser to form a record.
-
getAllLengths
int[] getAllLengths()
-
getFieldPaddings
char[] getFieldPaddings()
Returns the sequence of paddings used by each field of each record.
- Returns:
- the sequence of paddings used by each field of each record.
-
getFieldsToIgnore
boolean[] getFieldsToIgnore()
Returns the sequence of fields to ignore.
- Returns:
- the sequence of fields to ignore.
-
getFieldAlignments
FieldAlignment[] getFieldAlignments()
Returns the sequence of alignments to consider for each field of each record.
- Returns:
- the sequence of alignments to consider for each field of each record.
-
getSkipTrailingCharsUntilNewline
public boolean getSkipTrailingCharsUntilNewline()
Indicates whether or not any trailing characters beyond the record's length should be skipped until the newline is reached (defaults to false).
For example, if the record length is 5, but the row contains "12345678\n", then the portion containing "678\n" will be discarded and not considered part of the next record.
- Returns:
- returns true if any trailing characters beyond the record's length should be skipped until the newline is reached, false otherwise
-
setSkipTrailingCharsUntilNewline
public void setSkipTrailingCharsUntilNewline(boolean skipTrailingCharsUntilNewline)
Defines whether or not any trailing characters beyond the record's length should be skipped until the newline is reached (defaults to false).
For example, if the record length is 5, but the row contains "12345678\n", then the portion containing "678\n" will be discarded and not considered part of the next record.
- Parameters:
skipTrailingCharsUntilNewline
- a flag indicating if any trailing characters beyond the record's length should be skipped until the newline is reached
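The effect of this flag can be mimicked in plain Java. The helper below is a hypothetical illustration, not the parser's implementation: it treats the input as a sequence of fixed-length records and, when skipTrailing is true, drops everything after a record up to and including the next newline. It assumes records themselves contain no newline.

```java
import java.util.ArrayList;
import java.util.List;

public class SkipTrailingSketch {
    // Hypothetical helper, not part of univocity-parsers.
    static List<String> split(String input, int recordLength, boolean skipTrailing) {
        List<String> records = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            int end = Math.min(pos + recordLength, input.length());
            records.add(input.substring(pos, end));
            pos = end;
            if (skipTrailing) {
                // Discard the remainder of the line, including the newline itself.
                int newline = input.indexOf('\n', pos);
                pos = (newline < 0) ? input.length() : newline + 1;
            }
        }
        return records;
    }

    public static void main(String[] args) {
        // Record length 5, row "12345678\n": with the flag on, "678\n" is discarded.
        System.out.println(split("12345678\nabcde", 5, true));
    }
}
```

With the flag off, the same helper would carry "678\n" into the next record, which is the situation the setting is meant to avoid.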
-
getRecordEndsOnNewline
public boolean getRecordEndsOnNewline()
Indicates whether or not a record is considered parsed when a newline is reached (defaults to false). Examples:
- Consider two records of length 4, and the input 12\n3456
- When recordEndsOnNewline is set to true: the first value will be read as 12 and the second 3456
- When recordEndsOnNewline is set to false: the first value will be read as 12\n3 and the second 456
- Returns:
- true if a record should be considered parsed when a newline is reached; false otherwise
-
setRecordEndsOnNewline
public void setRecordEndsOnNewline(boolean recordEndsOnNewline)
Defines whether or not a record is considered parsed when a newline is reached. Examples:
- Consider two records of length 4, and the input 12\n3456
- When recordEndsOnNewline is set to true: the first value will be read as 12 and the second 3456
- When recordEndsOnNewline is set to false: the first value will be read as 12\n3 and the second 456
- Parameters:
recordEndsOnNewline
- a flag indicating whether or not a record is considered parsed when a newline is reached
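The two cases above can be reproduced with a small stand-alone sketch. The helper is hypothetical and only models the splitting rule described here for a single-field record. The real parser also deals with padding, alignment and line-separator normalization, which are ignored in this example.

```java
import java.util.ArrayList;
import java.util.List;

public class RecordEndsOnNewlineSketch {
    // Hypothetical helper, not part of univocity-parsers.
    static List<String> split(String input, int recordLength, boolean recordEndsOnNewline) {
        List<String> records = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            int end = Math.min(pos + recordLength, input.length());
            if (recordEndsOnNewline) {
                int newline = input.indexOf('\n', pos);
                if (newline >= 0 && newline < end) {
                    // The newline ends the record early; the newline itself is consumed.
                    records.add(input.substring(pos, newline));
                    pos = newline + 1;
                    continue;
                }
            }
            records.add(input.substring(pos, end));
            pos = end;
            // A newline directly after a full record only terminates that record.
            if (recordEndsOnNewline && pos < input.length() && input.charAt(pos) == '\n') {
                pos++;
            }
        }
        return records;
    }

    public static void main(String[] args) {
        System.out.println(split("12\n3456", 4, true));
    }
}
```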
-
createDefaultFormat
protected FixedWidthFormat createDefaultFormat()
Returns the default FixedWidthFormat configured to handle Fixed-Width inputs.
- Specified by:
createDefaultFormat in class CommonSettings<FixedWidthFormat>
- Returns:
- an instance of FixedWidthFormat configured to handle Fixed-Width inputs
-
newCharAppender
protected CharAppender newCharAppender()
Returns an instance of CharAppender with the configured limit of maximum characters per column, the default value used to represent a null value (when the String parsed from the input is empty), and the padding character to handle unwritten positions.
This overrides the parent implementation to create a CharAppender capable of handling padding characters that represent unwritten positions.
- Overrides:
newCharAppender in class CommonParserSettings<FixedWidthFormat>
- Returns:
- an instance of CharAppender with the configured limit of maximum characters per column, the default value used to represent a null value (when the String parsed from the input is empty), and the padding character to handle unwritten positions
-
getMaxCharsPerColumn
public int getMaxCharsPerColumn()
The maximum number of characters allowed for any given value being written/read. Used to avoid OutOfMemoryErrors (defaults to a minimum of 4096 characters).
This overrides the parent implementation and calculates the absolute minimum number of characters required to store the values of a record. If the sum of all field lengths is greater than the configured maximum number of characters per column, the calculated amount will be returned.
- Overrides:
getMaxCharsPerColumn in class CommonSettings<FixedWidthFormat>
- Returns:
- The maximum number of characters allowed for any given value being written/read
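The rule described above amounts to taking the larger of the configured limit and the sum of the field lengths. The following sketch shows that arithmetic only. The method name effectiveMaxChars is hypothetical, and the library's own calculation may account for details (such as lookahead formats) that are not modelled here.

```java
public class MaxCharsSketch {
    // Hypothetical helper: larger of the configured limit and the total record length.
    static int effectiveMaxChars(int configuredMax, int... fieldLengths) {
        int sum = 0;
        for (int length : fieldLengths) {
            sum += length;
        }
        return Math.max(configuredMax, sum);
    }

    public static void main(String[] args) {
        // Fields of 4 + 5 + 40 = 49 characters fit within the 4096 default.
        System.out.println(effectiveMaxChars(4096, 4, 5, 40));
        // With a configured limit of 10, the calculated 49 is returned instead.
        System.out.println(effectiveMaxChars(10, 4, 5, 40));
    }
}
```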
-
getMaxColumns
public int getMaxColumns()
Returns the hard limit of how many columns a record can have (defaults to a maximum of 512). You need this to avoid OutOfMemory errors in case of inputs that might be inconsistent with the format you are dealing with.
This overrides the parent implementation and calculates the absolute minimum number of columns required to store the values of a record. If the total number of fields is greater than the configured maximum number of columns, the calculated amount will be returned.
- Overrides:
getMaxColumns in class CommonSettings<FixedWidthFormat>
- Returns:
- The maximum number of columns a record can have.
-
calculateMaxFieldLengths
private int[] calculateMaxFieldLengths()
-
getLookaheadFormats
Lookup[] getLookaheadFormats()
-
getLookbehindFormats
Lookup[] getLookbehindFormats()
-
addFormatForLookahead
public void addFormatForLookahead(java.lang.String lookahead, FixedWidthFields lengths)
Defines the format of records identified by a lookahead symbol.
- Parameters:
lookahead
- the lookahead value that, when found in the input, will notify the parser to switch to a new record format, with different field lengths
lengths
- the field lengths of the record format identified by the given lookahead symbol.
-
addFormatForLookbehind
public void addFormatForLookbehind(java.lang.String lookbehind, FixedWidthFields lengths)
Defines the format of records identified by a lookbehind symbol.
- Parameters:
lookbehind
- the lookbehind value that, when found in the previous input row, will notify the parser to switch to a new record format, with different field lengths
lengths
- the field lengths of the record format identified by the given lookbehind symbol.
-
getUseDefaultPaddingForHeaders
public boolean getUseDefaultPaddingForHeaders()
Indicates whether headers should be parsed using the default padding specified in FixedWidthFormat.getPadding() instead of any custom padding associated with a given field (in FixedWidthFields.setPadding(char, int...)). Defaults to true.
- Returns:
true if the default padding is to be used when reading headers, otherwise false
-
setUseDefaultPaddingForHeaders
public void setUseDefaultPaddingForHeaders(boolean useDefaultPaddingForHeaders)
Defines whether headers should be parsed using the default padding specified in FixedWidthFormat.getPadding() instead of any custom padding associated with a given field (in FixedWidthFields.setPadding(char, int...)).
- Parameters:
useDefaultPaddingForHeaders
- flag indicating whether the default padding is to be used when parsing headers
-
configureFromAnnotations
protected void configureFromAnnotations(java.lang.Class<?> beanClass)
Description copied from class: CommonParserSettings
Configures the parser based on the annotations provided in a given class.
- Overrides:
configureFromAnnotations in class CommonParserSettings<FixedWidthFormat>
- Parameters:
beanClass
- the class whose annotations will be processed to derive configurations for parsing
-
addConfiguration
protected void addConfiguration(java.util.Map<java.lang.String,java.lang.Object> out)
- Overrides:
addConfiguration in class CommonParserSettings<FixedWidthFormat>
-
clone
public final FixedWidthParserSettings clone()
Clones this configuration object to reuse all user-provided settings, including the fixed-width field configuration.
- Overrides:
clone in class CommonParserSettings<FixedWidthFormat>
- Returns:
- a copy of all configurations applied to the current instance.
-
clone
@Deprecated protected final FixedWidthParserSettings clone(boolean clearInputSpecificSettings)
Deprecated. Doesn't really make sense for fixed-width. Use the alternative method clone(FixedWidthFields).
Clones this configuration object to reuse most user-provided settings. This includes the fixed-width field configuration, but doesn't include other input-specific settings. This method is meant to be used internally only.
- Overrides:
clone in class CommonParserSettings<FixedWidthFormat>
- Parameters:
clearInputSpecificSettings
- flag indicating whether to clear settings that are likely to be associated with a given input.
- Returns:
- a copy of all configurations applied to the current instance.
-
clone
public final FixedWidthParserSettings clone(FixedWidthFields fields)
Clones this configuration object to reuse most user-provided settings. Properties that are specific to a given input (such as header names and selection of fields) will be reset to their defaults. To obtain a full copy, use clone().
- Parameters:
fields
- the fixed-width field configuration to be used by the cloned settings object.
- Returns:
- a copy of the general configurations applied to the current instance.
-
clone
private FixedWidthParserSettings clone(boolean clearInputSpecificSettings, FixedWidthFields fields)
-
getKeepPadding
public final boolean getKeepPadding()
Indicates whether the padding character should be kept in the parsed value (defaults to false). This setting can be overridden for individual fields through FixedWidthFields.stripPaddingFrom(String, String...) and FixedWidthFields.keepPaddingOn(String, String...).
- Returns:
- flag indicating the padding character should be kept in the parsed value
-
setKeepPadding
public final void setKeepPadding(boolean keepPadding)
Configures the fixed-width parser to retain the padding character in any parsed values (defaults to false). This setting can be overridden for individual fields through FixedWidthFields.stripPaddingFrom(String, String...) and FixedWidthFields.keepPaddingOn(String, String...).
- Parameters:
keepPadding
- flag indicating the padding character should be kept in the parsed value
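To make the difference concrete, the sketch below shows what keeping or stripping padding means for one raw field. It is a hypothetical stand-alone helper that assumes a left-aligned field, so padding appears only at the end. It does not reproduce the parser's handling of alignment or whitespace trimming.

```java
public class KeepPaddingSketch {
    // Hypothetical helper: returns the field as-is, or without its trailing padding.
    static String readValue(String rawField, char padding, boolean keepPadding) {
        if (keepPadding) {
            return rawField;
        }
        int end = rawField.length();
        while (end > 0 && rawField.charAt(end - 1) == padding) {
            end--;
        }
        return rawField.substring(0, end);
    }

    public static void main(String[] args) {
        // A field of length 4 holding "ab" padded with '_'.
        System.out.println(readValue("ab__", '_', false));
        System.out.println(readValue("ab__", '_', true));
    }
}
```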
-
getKeepPaddingFlags
java.lang.Boolean[] getKeepPaddingFlags()
Returns the sequence of fields whose padding character must/must not be retained in the parsed value.
- Returns:
- the sequence that have an explicit 'keepPadding' flag.
-
-