Package org.htmlcleaner
Class Utils
- java.lang.Object
  - org.htmlcleaner.Utils
public class Utils extends Object
Common utilities.
Created by: Vladimir Nikic
Date: November, 2006.
-
-
Field Summary
Fields (Modifier and Type, Field)
- static Pattern DECIMAL
- static Pattern HEX_RELAXED
- static Pattern HEX_STRICT
-
Constructor Summary
Constructors
- Utils()
-
Method Summary
- static String deserializeEntities(String str, boolean recognizeUnicodeChars)
- static String escapeHtml(String s, CleanerProperties props)
  Escapes an HTML string.
- static String escapeXml(String s, boolean advanced, boolean recognizeUnicodeChars, boolean translateSpecialEntities, boolean isDomCreation, boolean transResCharsToNCR, boolean translateSpecialEntitiesToNCR)
  Change notes: 1) convert ASCII characters encoded in &#xx; format to the ASCII characters themselves, since the encoding may be an attempt to slip in malicious HTML; 2) convert &#xxx; format characters to their &quot; style representation, if one is available for the character.
- static String escapeXml(String s, boolean advanced, boolean recognizeUnicodeChars, boolean translateSpecialEntities, boolean isDomCreation, boolean transResCharsToNCR, boolean translateSpecialEntitiesToNCR, boolean isHtmlOutput)
  Change notes: 1) convert ASCII characters encoded in &#xx; format to the ASCII characters themselves, since the encoding may be an attempt to slip in malicious HTML; 2) convert &#xxx; format characters to their &quot; style representation, if one is available for the character.
- static String escapeXml(String s, CleanerProperties props, boolean isDomCreation)
  Escapes an XML string.
- static String fullUrl(String pageUrl, String link)
  Calculates the full URL for the specified page URL and a link, which may be full, absolute, or relative, such as those found in A or IMG tags.
- static String getXmlName(String name)
- static String getXmlNSPrefix(String name)
- static boolean isEmptyString(Object o)
- static boolean isFullUrl(String link)
  Checks whether the specified link is a full URL.
- static boolean isIdentifierHelperChar(char ch)
  Checks whether the specified character can be part of an XML identifier (tag name or attribute name) and is not a standard identifier character.
- static boolean isValidXmlIdentifier(String s)
  Checks whether the specified string can be a valid tag name or attribute name in XML.
- static boolean isValidXmlIdentifierChar(char ch)
  Checks whether the character can be part of an identifier for an XML attribute or element.
- static boolean isWhitespaceString(Object object)
  Checks whether the specified object's string representation is an empty string (consisting only of whitespace).
- static String ltrim(String s)
  Trims the specified string from the left.
- static String rtrim(String s)
  Trims the specified string from the right.
- static String sanitizeXmlAttributeName(String attName, String prefix)
  Attempts to replace invalid attribute names with valid ones.
- static String[] tokenize(String s, String delimiters)
-
-
-
Method Detail
-
isFullUrl
public static boolean isFullUrl(String link)
Checks whether the specified link is a full URL.
- Parameters:
  link - the link to check
- Returns:
  true if the link is a full URL, false otherwise
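The page does not say what counts as a "full" URL. The following is a minimal sketch, assuming that a full URL is one that begins with an http, https, or file scheme prefix. The class name, method name, and prefix list are hypothetical and are not taken from the HtmlCleaner source.

```java
import java.util.Locale;

// Hypothetical sketch; not the HtmlCleaner implementation.
final class FullUrlCheckSketch {
    // Assumption: these scheme prefixes are what makes a link "full".
    private static final String[] ASSUMED_PREFIXES = {"http://", "https://", "file://"};

    static boolean looksLikeFullUrl(String link) {
        if (link == null) {
            return false;
        }
        // Compare case-insensitively and ignore surrounding whitespace.
        String normalized = link.trim().toLowerCase(Locale.ROOT);
        for (String prefix : ASSUMED_PREFIXES) {
            if (normalized.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(looksLikeFullUrl("https://example.com/a.png"));
        System.out.println(looksLikeFullUrl("/images/a.png"));
    }
}
```

Under these assumptions, a site-absolute path such as /images/a.png is not treated as full, because it still needs the page's scheme and host.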
-
fullUrl
public static String fullUrl(String pageUrl, String link)
Calculates the full URL for the specified page URL and a link, which may be full, absolute, or relative, such as those found in A or IMG tags. (Reinstated as per user request in bug 159.)
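The three link forms mentioned above (full, absolute, relative) can be illustrated with the standard java.net.URI class. This sketch uses URI.resolve as a stand-in; it is not the HtmlCleaner implementation, whose resolution rules may differ, and the class and method names are hypothetical.

```java
import java.net.URI;

// Hypothetical sketch; uses java.net.URI rather than HtmlCleaner's own logic.
final class FullUrlResolveSketch {
    static String resolveLink(String pageUrl, String link) {
        // URI.resolve returns the link unchanged when it already has a scheme,
        // replaces the whole path when the link starts with "/",
        // and otherwise resolves it against the page's directory.
        return URI.create(pageUrl).resolve(link.trim()).toString();
    }

    public static void main(String[] args) {
        String page = "http://example.com/docs/page.html";
        System.out.println(resolveLink(page, "https://other.org/x.gif"));
        System.out.println(resolveLink(page, "/img/logo.png"));
        System.out.println(resolveLink(page, "img/logo.png"));
    }
}
```

Note that URI.create throws IllegalArgumentException for strings containing illegal characters such as unencoded spaces, so real-world HTML links may need tolerant pre-processing first.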
-
escapeHtml
public static String escapeHtml(String s, CleanerProperties props)
Escapes an HTML string.
- Parameters:
  s - the string to be escaped
  props - cleaner properties that affect escaping behaviour
- Returns:
  the escaped string
-
escapeXml
public static String escapeXml(String s, CleanerProperties props, boolean isDomCreation)
Escapes an XML string.
- Parameters:
  s - the string to be escaped
  props - cleaner properties that affect escaping behaviour
  isDomCreation - whether the escaped content will be part of the DOM
-
escapeXml
public static String escapeXml(String s, boolean advanced, boolean recognizeUnicodeChars, boolean translateSpecialEntities, boolean isDomCreation, boolean transResCharsToNCR, boolean translateSpecialEntitiesToNCR)
Change notes:
1) Convert ASCII characters encoded in &#xx; format to the ASCII characters themselves, since the encoding may be an attempt to slip in malicious HTML.
2) Convert &#xxx; format characters to their &quot; style representation, if one is available for the character.
3) Convert HTML special entities to XML &#xxx; form when outputting XML.
- Parameters:
  s -
  advanced -
  recognizeUnicodeChars -
  translateSpecialEntities -
  isDomCreation -
- Returns:
  TODO: Consider moving to CleanerProperties, since a long list of parameters is misleading.
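The first two change notes can be sketched as a single pass over numeric character references. This is an illustration only: the regular expression, the class and method names, and the set of named entities are assumptions, and the real DECIMAL, HEX_STRICT, and HEX_RELAXED patterns in Utils may match differently.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of change notes 1 and 2; not the HtmlCleaner implementation.
final class NumericReferenceSketch {
    // Assumed pattern: hexadecimal (&#x41;) or decimal (&#65;) references.
    private static final Pattern NUMERIC_REF =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]{1,6})|([0-9]{1,7}));");

    // Assumed set of characters that have a named XML entity.
    private static String namedEntityFor(int codePoint) {
        switch (codePoint) {
            case '<': return "&lt;";
            case '>': return "&gt;";
            case '&': return "&amp;";
            case '"': return "&quot;";
            case '\'': return "&apos;";
            default: return null;
        }
    }

    static String normalizeNumericReferences(String s) {
        Matcher m = NUMERIC_REF.matcher(s);
        StringBuilder out = new StringBuilder(s.length());
        int last = 0;
        while (m.find()) {
            out.append(s, last, m.start());
            int codePoint = (m.group(1) != null)
                    ? Integer.parseInt(m.group(1), 16)
                    : Integer.parseInt(m.group(2), 10);
            String named = namedEntityFor(codePoint);
            if (named != null) {
                out.append(named);              // note 2: prefer the named form
            } else if (codePoint >= 0x20 && codePoint < 0x7F) {
                out.append((char) codePoint);   // note 1: plain printable ASCII
            } else {
                out.append(m.group());          // everything else is left untouched
            }
            last = m.end();
        }
        out.append(s, last, s.length());
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(normalizeNumericReferences("&#60;script&#62; &#x41;&#66;"));
    }
}
```

Mapping an encoded markup character such as &#60; to its named entity rather than to a raw character keeps the output escaped, which is the concern the first change note raises. The sketch does not handle bare ampersands or references without a trailing semicolon.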
-
escapeXml
public static String escapeXml(String s, boolean advanced, boolean recognizeUnicodeChars, boolean translateSpecialEntities, boolean isDomCreation, boolean transResCharsToNCR, boolean translateSpecialEntitiesToNCR, boolean isHtmlOutput)
Change notes:
1) Convert ASCII characters encoded in &#xx; format to the ASCII characters themselves, since the encoding may be an attempt to slip in malicious HTML.
2) Convert &#xxx; format characters to their &quot; style representation, if one is available for the character.
3) Convert HTML special entities to XML &#xxx; form when outputting XML.
- Parameters:
  s -
  advanced -
  recognizeUnicodeChars -
  translateSpecialEntities -
  isDomCreation -
  isHtmlOutput -
- Returns:
  TODO: Consider moving to CleanerProperties, since a long list of parameters is misleading.
-
sanitizeXmlAttributeName
public static String sanitizeXmlAttributeName(String attName, String prefix)
Attempts to replace invalid attribute names with valid ones.
- Parameters:
  attName - the attribute name to fix
  prefix - the prefix to use to indicate an attribute name has been altered
- Returns:
-
isIdentifierHelperChar
public static boolean isIdentifierHelperChar(char ch)
Checks whether the specified character can be part of an XML identifier (tag name or attribute name) and is not a standard identifier character.
- Parameters:
  ch - the character to be checked
- Returns:
  true if it can be part of an XML identifier
-
isValidXmlIdentifier
public static boolean isValidXmlIdentifier(String s)
Checks whether the specified string can be a valid tag name or attribute name in XML.
- Parameters:
  s - the string to be checked
- Returns:
  true if the string is a valid XML identifier, false otherwise
-
isEmptyString
public static boolean isEmptyString(Object o)
- Parameters:
  o -
- Returns:
  true if the specified string is null or contains only whitespace characters
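Because the parameter is an Object rather than a String, the check presumably works on the object's string representation. A minimal sketch under that assumption follows; the class and method names are hypothetical, and "whitespace" here means what String.trim() removes (characters up to U+0020), which may not match the library's definition.

```java
// Hypothetical sketch; not the HtmlCleaner implementation.
final class EmptyStringSketch {
    static boolean isNullOrWhitespace(Object o) {
        if (o == null) {
            return true;
        }
        // Assumption: the object's toString() value is what gets tested.
        return o.toString().trim().isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(isNullOrWhitespace(null));
        System.out.println(isNullOrWhitespace(" \t\n"));
        System.out.println(isNullOrWhitespace(new StringBuilder(" text ")));
    }
}
```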
-
getXmlNSPrefix
public static String getXmlNSPrefix(String name)
- Parameters:
  name -
- Returns:
  for an XML element name or attribute name, the prefix (the part before :), or null if there is no prefix
-
getXmlName
public static String getXmlName(String name)
- Parameters:
  name -
- Returns:
  for an XML element name or attribute name, the name after the prefix (the part after :)
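The split that getXmlNSPrefix and getXmlName describe can be sketched with a single indexOf call. The class and method names below are hypothetical. Returning the whole name when there is no prefix is an assumption, since the page only specifies the null result for the prefix.

```java
// Hypothetical sketch of splitting a qualified XML name; not the HtmlCleaner implementation.
final class XmlNameSketch {
    // Returns the part before ':' or null when there is no prefix.
    static String prefixOf(String name) {
        int colon = name.indexOf(':');
        return colon > 0 ? name.substring(0, colon) : null;
    }

    // Returns the part after ':'; assumption: an unprefixed name is returned unchanged.
    static String localNameOf(String name) {
        int colon = name.indexOf(':');
        boolean hasPrefixAndLocalPart = colon > 0 && colon < name.length() - 1;
        return hasPrefixAndLocalPart ? name.substring(colon + 1) : name;
    }

    public static void main(String[] args) {
        System.out.println(prefixOf("xlink:href"));
        System.out.println(localNameOf("xlink:href"));
        System.out.println(prefixOf("href"));
    }
}
```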
-
isValidXmlIdentifierChar
public static boolean isValidXmlIdentifierChar(char ch)
Checks whether the character can be part of an identifier for an XML attribute or element.
- Returns:
  true if the character may be part of an identifier, false otherwise
-
isWhitespaceString
public static boolean isWhitespaceString(Object object)
Checks whether the specified object's string representation is an empty string (consisting only of whitespace).
- Parameters:
  object - the object whose string representation is checked
- Returns:
  true if the string is empty, false otherwise
-
-