Package com.univocity.parsers.common
Class NormalizedString
- java.lang.Object
-
- com.univocity.parsers.common.NormalizedString
-
- All Implemented Interfaces:
java.io.Serializable, java.lang.CharSequence, java.lang.Comparable<NormalizedString>
public final class NormalizedString extends java.lang.Object implements java.io.Serializable, java.lang.Comparable<NormalizedString>, java.lang.CharSequence
A NormalizedString allows representing text in a normalized fashion. Strings with different character case or surrounding whitespace are considered the same. It is used to represent groups of fields, where users may refer to their names using different character case or whitespace. Where the character case or the surrounding space is relevant, the NormalizedString will have its isLiteral() method return true, meaning the exact character case and surrounding whitespace are required for matching it. Invoking valueOf(String) with a String surrounded by single quotes will create a literal NormalizedString. Use literalValueOf(String) to obtain the same NormalizedString without having to introduce single quotes.
- See Also:
- Serialized Form
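The matching rules described above can be illustrated with a small stand-in class. This is a minimal sketch, not the library's implementation: the class name NormalizedKeySketch and its matches method are hypothetical, and it assumes that normalizing means trimming and lower-casing, and that surrounding single quotes are stripped when a literal is created.

```java
import java.util.Locale;

// Hypothetical stand-in for illustration only; not the library's NormalizedString.
final class NormalizedKeySketch {
    private final String original;
    private final String normalized;
    private final boolean literal;

    private NormalizedKeySketch(String original, boolean literal) {
        this.original = original;
        this.literal = literal;
        // Assumption: normalization trims and lower-cases; literals keep their exact content.
        this.normalized = literal ? original : original.trim().toLowerCase(Locale.ROOT);
    }

    static NormalizedKeySketch valueOf(String s) {
        // Assumption: a value wrapped in single quotes becomes a literal with the quotes removed.
        if (s.length() > 2 && s.startsWith("'") && s.endsWith("'")) {
            return new NormalizedKeySketch(s.substring(1, s.length() - 1), true);
        }
        return new NormalizedKeySketch(s, false);
    }

    static NormalizedKeySketch literalValueOf(String s) {
        return new NormalizedKeySketch(s, true);
    }

    boolean isLiteral() {
        return literal;
    }

    boolean matches(NormalizedKeySketch other) {
        // If either side is a literal, the exact content must match.
        if (literal || other.literal) {
            return original.equals(other.original);
        }
        return normalized.equals(other.normalized);
    }

    public static void main(String[] args) {
        System.out.println(valueOf(" Name ").matches(valueOf("name")));
        System.out.println(literalValueOf("Name").matches(valueOf("name")));
    }
}
```

In this sketch, " Name " and "name" match, while a literal "Name" only matches content that is exactly "Name".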
-
-
Field Summary
Fields (modifier and type, then field name):
private int hashCode
private boolean literal
private java.lang.String normalized
private java.lang.String original
private static long serialVersionUID
private static StringCache<NormalizedString> stringCache
-
Constructor Summary
Constructors (modifier, then constructor):
private NormalizedString(java.lang.String string)
-
Method Summary
Methods (modifier and type, method, then description where available):
char charAt(int index)
int compareTo(NormalizedString o)
int compareTo(java.lang.String o)
    Compares a NormalizedString against a String lexicographically.
boolean equals(java.lang.Object anObject)
static StringCache<NormalizedString> getCache()
    Returns the internal string cache to allow users to tweak its size limit or clear it when appropriate.
private static <T extends java.util.Collection<java.lang.String>> T getCollection(T out, NormalizedString... args)
private static <T extends java.util.Collection<NormalizedString>> T getCollection(T out, java.lang.String... args)
private static <T extends java.util.Collection<NormalizedString>> T getCollection(T out, java.util.Collection<java.lang.String> args)
private static <T extends java.util.Collection<java.lang.String>> T getStringCollection(T out, java.util.Collection<NormalizedString> args)
int hashCode()
static boolean identifyLiterals(NormalizedString[] strings)
    Analyzes a group of NormalizedString to identify any instances whose normalized content will generate clashes.
static boolean identifyLiterals(NormalizedString[] strings, boolean lowercaseIdentifiers, boolean uppercaseIdentifiers)
    Analyzes a group of NormalizedString to identify any instances whose normalized content will generate clashes.
boolean isLiteral()
int length()
static NormalizedString literalValueOf(java.lang.String string)
    Creates a literal NormalizedString, meaning it will only match with other String or NormalizedString if they have the exact same content, including character case and surrounding whitespace.
private java.lang.String normalize(java.lang.Object value)
private static boolean shouldBeLiteral(java.lang.String string, boolean lowercaseIdentifiers, boolean uppercaseIdentifiers)
java.lang.CharSequence subSequence(int start, int end)
static java.lang.String[] toArray(NormalizedString... args)
    Converts multiple normalized strings into an array of String.
static NormalizedString[] toArray(java.lang.String... args)
    Converts multiple plain strings into an array of NormalizedString.
static NormalizedString[] toArray(java.util.Collection<java.lang.String> args)
    Converts a collection of plain strings into an array of NormalizedString.
static java.util.ArrayList<NormalizedString> toArrayList(java.lang.String... args)
    Converts multiple plain strings into an ArrayList of NormalizedString.
static java.util.ArrayList<NormalizedString> toArrayList(java.util.Collection<java.lang.String> args)
    Converts multiple plain strings into an ArrayList of NormalizedString.
static java.util.ArrayList<java.lang.String> toArrayListOfStrings(NormalizedString... args)
    Converts multiple normalized strings into an ArrayList of String.
static java.util.ArrayList<java.lang.String> toArrayListOfStrings(java.util.Collection<NormalizedString> args)
    Converts multiple normalized strings into an ArrayList of String.
static java.util.HashSet<NormalizedString> toHashSet(java.lang.String... args)
    Converts multiple plain strings into a HashSet of NormalizedString.
static java.util.HashSet<NormalizedString> toHashSet(java.util.Collection<java.lang.String> args)
    Converts multiple plain strings into a HashSet of NormalizedString.
static java.util.HashSet<java.lang.String> toHashSetOfStrings(NormalizedString... args)
    Converts multiple normalized strings into a HashSet of String.
static java.util.HashSet<java.lang.String> toHashSetOfStrings(java.util.Collection<NormalizedString> args)
    Converts multiple normalized strings into a HashSet of String.
static NormalizedString[] toIdentifierGroupArray(NormalizedString[] strings)
    Analyzes a group of NormalizedString to identify any instances whose normalized content will generate clashes.
static NormalizedString[] toIdentifierGroupArray(java.lang.String[] strings)
    Analyzes a group of String to identify any instances whose normalized content will generate clashes.
static java.util.LinkedHashSet<NormalizedString> toLinkedHashSet(java.lang.String... args)
    Converts multiple plain strings into a LinkedHashSet of NormalizedString.
static java.util.LinkedHashSet<NormalizedString> toLinkedHashSet(java.util.Collection<java.lang.String> args)
    Converts multiple plain strings into a LinkedHashSet of NormalizedString.
static java.util.LinkedHashSet<java.lang.String> toLinkedHashSetOfStrings(NormalizedString... args)
    Converts multiple normalized strings into a LinkedHashSet of String.
static java.util.LinkedHashSet<java.lang.String> toLinkedHashSetOfStrings(java.util.Collection<NormalizedString> args)
    Converts multiple normalized strings into a LinkedHashSet of String.
NormalizedString toLiteral()
    Returns the literal representation of this NormalizedString, meaning it will only match with other String or NormalizedString if they have the exact same content, including character case and surrounding whitespace.
java.lang.String toString()
static java.lang.String[] toStringArray(java.util.Collection<NormalizedString> args)
    Converts a collection of normalized strings into an array of String.
static java.util.TreeSet<NormalizedString> toTreeSet(java.lang.String... args)
    Converts multiple plain strings into a TreeSet of NormalizedString.
static java.util.TreeSet<NormalizedString> toTreeSet(java.util.Collection<java.lang.String> args)
    Converts multiple plain strings into a TreeSet of NormalizedString.
static java.util.TreeSet<java.lang.String> toTreeSetOfStrings(NormalizedString... args)
    Converts multiple normalized strings into a TreeSet of String.
static java.util.TreeSet<java.lang.String> toTreeSetOfStrings(java.util.Collection<NormalizedString> args)
    Converts multiple normalized strings into a TreeSet of String.
static NormalizedString[] toUniqueArray(java.lang.String... args)
    Converts multiple plain strings into an array of NormalizedString, ensuring no duplicate NormalizedString elements exist, even if their original Strings are different.
static java.lang.String valueOf(NormalizedString string)
    Converts a NormalizedString back to its original String representation.
static NormalizedString valueOf(java.lang.Object o)
    Creates a non-literal NormalizedString, meaning it will match with other String or NormalizedString regardless of differences in character case and surrounding whitespace.
static NormalizedString valueOf(java.lang.String string)
    Creates a non-literal NormalizedString, meaning it will match with other String or NormalizedString regardless of differences in character case and surrounding whitespace.
-
-
-
Field Detail
-
serialVersionUID
private static final long serialVersionUID
- See Also:
- Constant Field Values
-
stringCache
private static final StringCache<NormalizedString> stringCache
-
original
private final java.lang.String original
-
normalized
private final java.lang.String normalized
-
literal
private final boolean literal
-
hashCode
private final int hashCode
-
-
Method Detail
-
normalize
private java.lang.String normalize(java.lang.Object value)
-
isLiteral
public boolean isLiteral()
-
equals
public boolean equals(java.lang.Object anObject)
- Overrides:
equals in class java.lang.Object
-
hashCode
public int hashCode()
- Overrides:
hashCode in class java.lang.Object
-
length
public int length()
- Specified by:
length in interface java.lang.CharSequence
-
charAt
public char charAt(int index)
- Specified by:
charAt in interface java.lang.CharSequence
-
subSequence
public java.lang.CharSequence subSequence(int start, int end)
- Specified by:
subSequence in interface java.lang.CharSequence
-
compareTo
public int compareTo(NormalizedString o)
- Specified by:
compareTo in interface java.lang.Comparable<NormalizedString>
-
compareTo
public int compareTo(java.lang.String o)
Compares a NormalizedString against a String lexicographically.
- Parameters:
o - a plain String
- Returns:
- the result of String.compareTo(String). If this NormalizedString is a literal, the original argument string will be compared. If this NormalizedString is not a literal, the result comes from comparing the normalized content of both strings (i.e. surrounding whitespace and character case differences will be ignored).
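The two comparison modes described in the return value can be sketched on plain strings. This is only an illustration under stated assumptions: the class CompareSketch and its compare method are hypothetical, the literal flag is passed in explicitly instead of being read from a NormalizedString, and normalization is assumed to be trim plus lower-casing.

```java
import java.util.Locale;

// Hypothetical helper for illustration; it does not call the library.
final class CompareSketch {
    // Assumption: normalized content is the trimmed, lower-cased text.
    static String normalize(String s) {
        return s.trim().toLowerCase(Locale.ROOT);
    }

    static int compare(String original, boolean literal, String other) {
        if (literal) {
            // Literal: compare the exact text, so case and whitespace matter.
            return original.compareTo(other);
        }
        // Non-literal: compare the normalized content of both sides.
        return normalize(original).compareTo(normalize(other));
    }

    public static void main(String[] args) {
        System.out.println(compare(" Name ", false, "NAME"));
        System.out.println(compare("Name", true, "name"));
    }
}
```

Under these assumptions, a non-literal " Name " compares as equal to "NAME", while a literal "Name" sorts before "name" because of String.compareTo's character ordering.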
-
toString
public java.lang.String toString()
- Specified by:
toString in interface java.lang.CharSequence
- Overrides:
toString in class java.lang.Object
-
literalValueOf
public static NormalizedString literalValueOf(java.lang.String string)
Creates a literal NormalizedString, meaning it will only match with other String or NormalizedString if they have the exact same content, including character case and surrounding whitespace.
- Parameters:
string - the input String
- Returns:
- the literal NormalizedString version of the given string.
-
valueOf
public static NormalizedString valueOf(java.lang.Object o)
Creates a non-literal NormalizedString, meaning it will match with other String or NormalizedString regardless of differences in character case and surrounding whitespace. If the input value is enclosed in single quotes, a literal NormalizedString will be returned, as described in literalValueOf(String).
- Parameters:
o - the input object whose String representation will be used
- Returns:
- the NormalizedString of the given object.
-
valueOf
public static NormalizedString valueOf(java.lang.String string)
Creates a non-literal NormalizedString, meaning it will match with other String or NormalizedString regardless of differences in character case and surrounding whitespace. If the input string is enclosed in single quotes, a literal NormalizedString will be returned, as described in literalValueOf(String).
- Parameters:
string - the input string
- Returns:
- the NormalizedString of the given string.
-
valueOf
public static java.lang.String valueOf(NormalizedString string)
Converts a NormalizedString back to its original String representation.
- Parameters:
string - the normalized string
- Returns:
- the original string used to create the given normalized representation.
-
toArray
public static NormalizedString[] toArray(java.util.Collection<java.lang.String> args)
Converts a collection of plain strings into an array of NormalizedString.
- Parameters:
args - the strings to convert to NormalizedString
- Returns:
- the NormalizedString representations of all input strings.
-
toStringArray
public static java.lang.String[] toStringArray(java.util.Collection<NormalizedString> args)
Converts a collection of normalized strings into an array of String.
- Parameters:
args - the normalized strings to convert back to String
- Returns:
- the String representations of all normalized strings.
-
toUniqueArray
public static NormalizedString[] toUniqueArray(java.lang.String... args)
Converts multiple plain strings into an array of NormalizedString, ensuring no duplicate NormalizedString elements exist, even if their original Strings are different.
- Parameters:
args - the strings to convert to NormalizedString
- Returns:
- the NormalizedString representations of all input strings.
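The idea of removing entries that are duplicates only after normalization can be sketched on plain strings. This is a hedged illustration, not the library's code: UniqueNamesSketch and uniqueByNormalizedForm are hypothetical names, normalization is assumed to be trim plus lower-casing, and keeping the first occurrence in input order is an assumption the documentation above does not state.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration; it does not call the library.
final class UniqueNamesSketch {
    // Assumption: normalized content is the trimmed, lower-cased text.
    static String normalize(String s) {
        return s.trim().toLowerCase(Locale.ROOT);
    }

    static List<String> uniqueByNormalizedForm(String... args) {
        // The map key is the normalized form; the value is the first original seen for it.
        Map<String, String> firstSeen = new LinkedHashMap<>();
        for (String arg : args) {
            firstSeen.putIfAbsent(normalize(arg), arg);
        }
        return new ArrayList<>(firstSeen.values());
    }

    public static void main(String[] args) {
        System.out.println(uniqueByNormalizedForm("Name", " name ", "AGE"));
    }
}
```

Here "Name" and " name " differ as plain strings but collapse into a single entry because their normalized forms are equal.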
-
toArray
public static NormalizedString[] toArray(java.lang.String... args)
Converts multiple plain strings into an array of NormalizedString.
- Parameters:
args - the strings to convert to NormalizedString
- Returns:
- the NormalizedString representations of all input strings.
-
toArray
public static java.lang.String[] toArray(NormalizedString... args)
Converts multiple normalized strings into an array of String.
- Parameters:
args - the normalized strings to convert to String
- Returns:
- the String representations of all input strings.
-
getCollection
private static <T extends java.util.Collection<NormalizedString>> T getCollection(T out, java.lang.String... args)
-
getCollection
private static <T extends java.util.Collection<NormalizedString>> T getCollection(T out, java.util.Collection<java.lang.String> args)
-
getCollection
private static <T extends java.util.Collection<java.lang.String>> T getCollection(T out, NormalizedString... args)
-
getStringCollection
private static <T extends java.util.Collection<java.lang.String>> T getStringCollection(T out, java.util.Collection<NormalizedString> args)
-
toArrayList
public static java.util.ArrayList<NormalizedString> toArrayList(java.lang.String... args)
Converts multiple plain strings into an ArrayList of NormalizedString.
- Parameters:
args - the strings to convert to NormalizedString
- Returns:
- the NormalizedString representations of all input strings.
-
toArrayList
public static java.util.ArrayList<NormalizedString> toArrayList(java.util.Collection<java.lang.String> args)
Converts multiple plain strings into an ArrayList of NormalizedString.
- Parameters:
args - the strings to convert to NormalizedString
- Returns:
- the NormalizedString representations of all input strings.
-
toArrayListOfStrings
public static java.util.ArrayList<java.lang.String> toArrayListOfStrings(NormalizedString... args)
Converts multiple normalized strings into an ArrayList of String.
- Parameters:
args - the normalized strings to convert to String
- Returns:
- the original Strings of all input normalized strings.
-
toArrayListOfStrings
public static java.util.ArrayList<java.lang.String> toArrayListOfStrings(java.util.Collection<NormalizedString> args)
Converts multiple normalized strings into an ArrayList of String.
- Parameters:
args - the normalized strings to convert to String
- Returns:
- the original Strings of all input normalized strings.
-
toTreeSet
public static java.util.TreeSet<NormalizedString> toTreeSet(java.lang.String... args)
Converts multiple plain strings into a TreeSet of NormalizedString.
- Parameters:
args - the strings to convert to NormalizedString
- Returns:
- the NormalizedString representations of all input strings.
-
toTreeSet
public static java.util.TreeSet<NormalizedString> toTreeSet(java.util.Collection<java.lang.String> args)
Converts multiple plain strings into a TreeSet of NormalizedString.
- Parameters:
args - the strings to convert to NormalizedString
- Returns:
- the NormalizedString representations of all input strings.
-
toTreeSetOfStrings
public static java.util.TreeSet<java.lang.String> toTreeSetOfStrings(NormalizedString... args)
Converts multiple normalized strings into a TreeSet of String.
- Parameters:
args - the normalized strings to convert to String
- Returns:
- the original Strings of all input normalized strings.
-
toTreeSetOfStrings
public static java.util.TreeSet<java.lang.String> toTreeSetOfStrings(java.util.Collection<NormalizedString> args)
Converts multiple normalized strings into a TreeSet of String.
- Parameters:
args - the normalized strings to convert to String
- Returns:
- the original Strings of all input normalized strings.
-
toHashSet
public static java.util.HashSet<NormalizedString> toHashSet(java.lang.String... args)
Converts multiple plain strings into a HashSet of NormalizedString.
- Parameters:
args - the strings to convert to NormalizedString
- Returns:
- the NormalizedString representations of all input strings.
-
toHashSet
public static java.util.HashSet<NormalizedString> toHashSet(java.util.Collection<java.lang.String> args)
Converts multiple plain strings into a HashSet of NormalizedString.
- Parameters:
args - the strings to convert to NormalizedString
- Returns:
- the NormalizedString representations of all input strings.
-
toHashSetOfStrings
public static java.util.HashSet<java.lang.String> toHashSetOfStrings(NormalizedString... args)
Converts multiple normalized strings into a HashSet of String.
- Parameters:
args - the normalized strings to convert to String
- Returns:
- the original Strings of all input normalized strings.
-
toHashSetOfStrings
public static java.util.HashSet<java.lang.String> toHashSetOfStrings(java.util.Collection<NormalizedString> args)
Converts multiple normalized strings into a HashSet of String.
- Parameters:
args - the normalized strings to convert to String
- Returns:
- the original Strings of all input normalized strings.
-
toLinkedHashSet
public static java.util.LinkedHashSet<NormalizedString> toLinkedHashSet(java.lang.String... args)
Converts multiple plain strings into a LinkedHashSet of NormalizedString.
- Parameters:
args - the strings to convert to NormalizedString
- Returns:
- the NormalizedString representations of all input strings.
-
toLinkedHashSet
public static java.util.LinkedHashSet<NormalizedString> toLinkedHashSet(java.util.Collection<java.lang.String> args)
Converts multiple plain strings into a LinkedHashSet of NormalizedString.
- Parameters:
args - the strings to convert to NormalizedString
- Returns:
- the NormalizedString representations of all input strings.
-
toLinkedHashSetOfStrings
public static java.util.LinkedHashSet<java.lang.String> toLinkedHashSetOfStrings(NormalizedString... args)
Converts multiple normalized strings into a LinkedHashSet of String.
- Parameters:
args - the normalized strings to convert to String
- Returns:
- the original Strings of all input normalized strings.
-
toLinkedHashSetOfStrings
public static java.util.LinkedHashSet<java.lang.String> toLinkedHashSetOfStrings(java.util.Collection<NormalizedString> args)
Converts multiple normalized strings into a LinkedHashSet of String.
- Parameters:
args - the normalized strings to convert to String
- Returns:
- the original Strings of all input normalized strings.
-
toLiteral
public NormalizedString toLiteral()
Returns the literal representation of this NormalizedString, meaning it will only match with other String or NormalizedString if they have the exact same content, including character case and surrounding whitespace.
- Returns:
- the literal representation of the current NormalizedString
-
toIdentifierGroupArray
public static NormalizedString[] toIdentifierGroupArray(NormalizedString[] strings)
Analyzes a group of NormalizedString to identify any instances whose normalized content will generate clashes. Any clashing entries will be converted to their literal counterparts (using toLiteral()), making it possible to identify one from the other.
- Parameters:
strings - a group of identifiers that may contain ambiguous entries if their character case or surrounding whitespace is not considered. This array will be modified.
- Returns:
- the input string array, with NormalizedString literals in the positions where clashes would originally occur.
-
toIdentifierGroupArray
public static NormalizedString[] toIdentifierGroupArray(java.lang.String[] strings)
Analyzes a group of String to identify any instances whose normalized content will generate clashes. Any clashing entries will be converted to their literal counterparts (using toLiteral()), making it possible to identify one from the other.
- Parameters:
strings - a group of identifiers that may contain ambiguous entries if their character case or surrounding whitespace is not considered.
- Returns:
- a NormalizedString array with literals in the positions where clashes would originally occur.
-
identifyLiterals
public static boolean identifyLiterals(NormalizedString[] strings)
Analyzes a group of NormalizedString to identify any instances whose normalized content will generate clashes. Any clashing entries will be converted to their literal counterparts (using toLiteral()), making it possible to identify one from the other.
- Parameters:
strings - a group of identifiers that may contain ambiguous entries if their character case or surrounding whitespace is not considered. This array will be modified.
- Returns:
- true if any entry has been modified to be a literal, otherwise false
-
identifyLiterals
public static boolean identifyLiterals(NormalizedString[] strings, boolean lowercaseIdentifiers, boolean uppercaseIdentifiers)
Analyzes a group of NormalizedString to identify any instances whose normalized content will generate clashes. Any clashing entries will be converted to their literal counterparts (using toLiteral()), making it possible to identify one from the other.
- Parameters:
strings - a group of identifiers that may contain ambiguous entries if their character case or surrounding whitespace is not considered. This array will be modified.
lowercaseIdentifiers - flag indicating that identifiers are stored in lower case (for compatibility with databases). If a string has an uppercase character, it must become a literal.
uppercaseIdentifiers - flag indicating that identifiers are stored in upper case (for compatibility with databases). If a string has a lowercase character, it must become a literal.
- Returns:
- true if any entry has been modified to be a literal, otherwise false
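The clash analysis and the two case flags can be sketched on plain strings. This is a minimal illustration under assumptions, not the library's algorithm: LiteralClashSketch, flagLiterals and violatesCase are hypothetical names, normalization is assumed to be trim plus lower-casing, and every member of a clashing group is flagged (the documentation says clashing entries become literals but does not spell out the exact procedure).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration; it does not call the library.
final class LiteralClashSketch {
    // Assumption: normalized content is the trimmed, lower-cased text.
    static String normalize(String s) {
        return s.trim().toLowerCase(Locale.ROOT);
    }

    // Returns one flag per input name: true where the entry would need to become a literal.
    static boolean[] flagLiterals(String[] names, boolean lowercaseIdentifiers, boolean uppercaseIdentifiers) {
        // Count how many entries share each normalized form.
        Map<String, Integer> counts = new HashMap<>();
        for (String name : names) {
            counts.merge(normalize(name), 1, Integer::sum);
        }
        boolean[] flags = new boolean[names.length];
        for (int i = 0; i < names.length; i++) {
            boolean clash = counts.get(normalize(names[i])) > 1;
            flags[i] = clash || violatesCase(names[i], lowercaseIdentifiers, uppercaseIdentifiers);
        }
        return flags;
    }

    // True if the text contains a character of the case the storage convention forbids.
    static boolean violatesCase(String s, boolean lowercaseIdentifiers, boolean uppercaseIdentifiers) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (lowercaseIdentifiers && Character.isUpperCase(c)) return true;
            if (uppercaseIdentifiers && Character.isLowerCase(c)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        String[] names = {"Name", "name", "age"};
        System.out.println(Arrays.toString(flagLiterals(names, false, false)));
    }
}
```

With both flags off, only "Name" and "name" are flagged because they clash after normalization. With lowercaseIdentifiers set, "Name" is flagged even without a clash because it contains an uppercase character.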
-
shouldBeLiteral
private static boolean shouldBeLiteral(java.lang.String string, boolean lowercaseIdentifiers, boolean uppercaseIdentifiers)
-
getCache
public static StringCache<NormalizedString> getCache()
Returns the internal string cache to allow users to tweak its size limit or clear it when appropriate.
- Returns:
- the string cache used to store NormalizedString instances associated with their original String.
-
-