Java – Characters


Java – Character Class





Normally, when we work with characters, we use the primitive data type char.

Example


char ch = 'a';

// Unicode for uppercase Greek omega character
char uniChar = '\u03A9';

// an array of chars
char[] charArray = { 'a', 'b', 'c', 'd', 'e' };

Use of Character Class in Java

However, in development we come across situations where we need to use objects instead of primitive data types. For this purpose, Java provides the wrapper class Character for the primitive data type char.

Java Character Class

The Character class offers a number of useful class (i.e., static) methods for manipulating characters. You can create a Character object with the Character constructor, although this constructor has been deprecated since Java 9 in favor of Character.valueOf(char) −


Character ch = new Character('a');

The Java compiler will also create a Character object for you under some circumstances. For example, if you pass a primitive char into a method that expects an object, the compiler automatically converts the char to a Character for you. This feature is called autoboxing, or unboxing if the conversion goes the other way.

Example of Java Character Class


// Here the primitive char 'a'
// is boxed into the Character object ch
Character ch = 'a';

// Here the primitive 'x' is boxed for the method test,
// and the return value is unboxed to the char c
char c = test('x');
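The fragment above does not define test, so here is a minimal runnable sketch of the same idea. The class name AutoboxingSketch, the body of test, and the helper toList are assumptions made for illustration only; they are not part of the Character API.

```java
import java.util.ArrayList;
import java.util.List;

class AutoboxingSketch {

    // Hypothetical helper: takes and returns a Character object, so a char
    // argument is boxed on the way in and the result can be unboxed by the caller.
    static Character test(Character c) {
        // c is unboxed for toUpperCase(char); the char result is boxed again
        return Character.toUpperCase(c);
    }

    // Collections store objects, so every char is boxed when it is added.
    static List<Character> toList(String s) {
        List<Character> out = new ArrayList<>();
        for (char ch : s.toCharArray()) {
            out.add(ch); // autoboxing: char -> Character
        }
        return out;
    }

    public static void main(String[] args) {
        Character ch = 'a';   // boxed into a Character
        char c = test('x');   // 'x' is boxed for the call, the result is unboxed
        System.out.println(ch + " " + c);   // prints: a X
        System.out.println(toList("abc"));  // prints: [a, b, c]
    }
}
```

A generic collection such as List<Character> is a typical place where the wrapper class is required, because type parameters cannot be primitive types.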

Escape Sequences

A character preceded by a backslash (\) is an escape sequence and has a special meaning to the compiler.

The newline character (\n) has been used frequently in this tutorial in System.out.println() statements to advance to the next line after the string is printed.

Following table shows the Java escape sequences −

Escape Sequence Description
\t Inserts a tab in the text at this point.
\b Inserts a backspace in the text at this point.
\n Inserts a newline in the text at this point.
\r Inserts a carriage return in the text at this point.
\f Inserts a form feed in the text at this point.
\' Inserts a single quote character in the text at this point.
\" Inserts a double quote character in the text at this point.
\\ Inserts a backslash character in the text at this point.

When an escape sequence is encountered in a print statement, the compiler interprets it accordingly.

Example: Escape Sequences

If you want to put quotes within quotes, you must use the escape sequence, \", on the interior quotes −


public class Test {

   public static void main(String[] args) {
      System.out.println("She said \"Hello!\" to me.");
   }
}

Output


She said "Hello!" to me.
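As a further illustration, the sketch below combines several escape sequences in one string. The class name EscapeSketch and the sample text are made up for this example.

```java
class EscapeSketch {

    // Builds a three-line string using \t, \n, \\, \" and \'.
    static String sample() {
        return "Name:\tDuke\nPath:\tC:\\java\nQuote:\t\"Hi\" and \'ok\'";
    }

    public static void main(String[] args) {
        System.out.println(sample());

        // An escape sequence is written with two characters in the source,
        // but it becomes a single char in the resulting string.
        System.out.println("\t".length());  // prints: 1
        System.out.println("\\".length());  // prints: 1
    }
}
```

The first println prints three lines, each with a tab after the label; the backslash in the path and the quotes around Hi appear literally in the output.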

Character Class

Declaration

Following is the declaration for java.lang.Character class −


public final class Character
   extends Object
      implements Serializable, Comparable<Character>

Field

Following are the fields for java.lang.Character class −

  • static byte COMBINING_SPACING_MARK − This is the General category “Mc” in the Unicode specification.

  • static byte CONNECTOR_PUNCTUATION − This is the General category “Pc” in the Unicode specification.

  • static byte CONTROL − This is the General category “Cc” in the Unicode specification.

  • static byte CURRENCY_SYMBOL − This is the General category “Sc” in the Unicode specification.

  • static byte DASH_PUNCTUATION − This is the General category “Pd” in the Unicode specification.

  • static byte DECIMAL_DIGIT_NUMBER − This is the General category “Nd” in the Unicode specification.

  • static byte DIRECTIONALITY_ARABIC_NUMBER − This is the Weak bidirectional character type “AN” in the Unicode specification.

  • static byte DIRECTIONALITY_BOUNDARY_NEUTRAL − This is the Weak bidirectional character type “BN” in the Unicode specification.

  • static byte DIRECTIONALITY_COMMON_NUMBER_SEPARATOR − This is the Weak bidirectional character type “CS” in the Unicode specification.

  • static byte DIRECTIONALITY_EUROPEAN_NUMBER − This is the Weak bidirectional character type “EN” in the Unicode specification.

  • static byte DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR − This is the Weak bidirectional character type “ES” in the Unicode specification.

  • static byte DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR − This is the Weak bidirectional character type “ET” in the Unicode specification.

  • static byte DIRECTIONALITY_LEFT_TO_RIGHT − This is the Strong bidirectional character type “L” in the Unicode specification.

  • static byte DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING − This is the Strong bidirectional character type “LRE” in the Unicode specification.

  • static byte DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE − This is the Strong bidirectional character type “LRO” in the Unicode specification.

  • static byte DIRECTIONALITY_NONSPACING_MARK − This is the Weak bidirectional character type “NSM” in the Unicode specification.

  • static byte DIRECTIONALITY_OTHER_NEUTRALS − This is the Neutral bidirectional character type “ON” in the Unicode specification.

  • static byte DIRECTIONALITY_PARAGRAPH_SEPARATOR − This is the Neutral bidirectional character type “B” in the Unicode specification.

  • static byte DIRECTIONALITY_POP_DIRECTIONAL_FORMAT − This is the Weak bidirectional character type “PDF” in the Unicode specification.

  • static byte DIRECTIONALITY_RIGHT_TO_LEFT − This is the Strong bidirectional character type “R” in the Unicode specification.

  • static byte DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC − This is the Strong bidirectional character type “AL” in the Unicode specification.

  • static byte DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING − This is the Strong bidirectional character type “RLE” in the Unicode specification.

  • static byte DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE − This is the Strong bidirectional character type “RLO” in the Unicode specification.

  • static byte DIRECTIONALITY_SEGMENT_SEPARATOR − This is the Neutral bidirectional character type “S” in the Unicode specification.

  • static byte DIRECTIONALITY_UNDEFINED − This is the Undefined bidirectional character type.

  • static byte DIRECTIONALITY_WHITESPACE − This is the Neutral bidirectional character type “WS” in the Unicode specification.

  • static byte ENCLOSING_MARK − This is the General category “Me” in the Unicode specification.

  • static byte END_PUNCTUATION − This is the General category “Pe” in the Unicode specification.

  • static byte FINAL_QUOTE_PUNCTUATION − This is the General category “Pf” in the Unicode specification.

  • static byte FORMAT − This is the General category “Cf” in the Unicode specification.

  • static byte INITIAL_QUOTE_PUNCTUATION − This is the General category “Pi” in the Unicode specification.

  • static byte LETTER_NUMBER − This is the General category “Nl” in the Unicode specification.

  • static byte LINE_SEPARATOR − This is the General category “Zl” in the Unicode specification.

  • static byte LOWERCASE_LETTER − This is the General category “Ll” in the Unicode specification.

  • static byte MATH_SYMBOL − This is the General category “Sm” in the Unicode specification.

  • static int MAX_CODE_POINT − This is the maximum value of a Unicode code point.

  • static char MAX_HIGH_SURROGATE − This is the maximum value of a Unicode high-surrogate code unit in the UTF-16 encoding.

  • static char MAX_LOW_SURROGATE − This is the maximum value of a Unicode low-surrogate code unit in the UTF-16 encoding.

  • static int MAX_RADIX − This is the maximum radix available for conversion to and from strings.

  • static char MAX_SURROGATE − This is the maximum value of a Unicode surrogate code unit in the UTF-16 encoding.

  • static char MAX_VALUE − The constant value of this field is the largest value of type char, '\uFFFF'.

  • static int MIN_CODE_POINT − This is the minimum value of a Unicode code point.

  • static char MIN_HIGH_SURROGATE − This is the minimum value of a Unicode high-surrogate code unit in the UTF-16 encoding.

  • static char MIN_LOW_SURROGATE − This is the minimum value of a Unicode low-surrogate code unit in the UTF-16 encoding.

  • static int MIN_RADIX − This is the minimum radix available for conversion to and from strings.

  • static int MIN_SUPPLEMENTARY_CODE_POINT − This is the minimum value of a supplementary code point.

  • static char MIN_SURROGATE − This is the minimum value of a Unicode surrogate code unit in the UTF-16 encoding.

  • static char MIN_VALUE − The constant value of this field is the smallest value of type char, '\u0000'.

  • static byte MODIFIER_LETTER − This is the General category “Lm” in the Unicode specification.

  • static byte MODIFIER_SYMBOL − This is the General category “Sk” in the Unicode specification.

  • static byte NON_SPACING_MARK − This is the General category “Mn” in the Unicode specification.

  • static byte OTHER_LETTER − This is the General category “Lo” in the Unicode specification.

  • static byte OTHER_NUMBER − This is the General category “No” in the Unicode specification.

  • static byte OTHER_PUNCTUATION − This is the General category “Po” in the Unicode specification.

  • static byte OTHER_SYMBOL − This is the General category “So” in the Unicode specification.

  • static byte PARAGRAPH_SEPARATOR − This is the General category “Zp” in the Unicode specification.

  • static byte PRIVATE_USE − This is the General category “Co” in the Unicode specification.

  • static int SIZE − This is the number of bits used to represent a char value in unsigned binary form.

  • static byte SPACE_SEPARATOR − This is the General category “Zs” in the Unicode specification.

  • static byte START_PUNCTUATION − This is the General category “Ps” in the Unicode specification.

  • static byte SURROGATE − This is the General category “Cs” in the Unicode specification.

  • static byte TITLECASE_LETTER − This is the General category “Lt” in the Unicode specification.

  • static Class<Character> TYPE − This is the Class instance representing the primitive type char.

  • static byte UNASSIGNED − This is the General category “Cn” in the Unicode specification.

  • static byte UPPERCASE_LETTER − This is the General category “Lu” in the Unicode specification.
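To show how a few of these fields are typically read, here is a short sketch. The class name CharacterConstantsSketch and the helper isUppercaseCategory are assumptions for illustration; the constants and getType() are the ones listed in this chapter.

```java
class CharacterConstantsSketch {

    // Hypothetical helper: compares the general category returned by
    // getType() with the UPPERCASE_LETTER constant.
    static boolean isUppercaseCategory(char ch) {
        return Character.getType(ch) == Character.UPPERCASE_LETTER;
    }

    public static void main(String[] args) {
        // Range and size of the primitive type char
        System.out.println((int) Character.MIN_VALUE);  // prints: 0
        System.out.println((int) Character.MAX_VALUE);  // prints: 65535
        System.out.println(Character.SIZE);             // prints: 16

        // Radix limits used by digit() and forDigit()
        System.out.println(Character.MIN_RADIX + ".." + Character.MAX_RADIX);  // prints: 2..36

        // Largest Unicode code point, shown in hexadecimal
        System.out.println(Integer.toHexString(Character.MAX_CODE_POINT));     // prints: 10ffff

        // General category constants are compared with getType()
        System.out.println(isUppercaseCategory('A'));   // prints: true
        System.out.println(Character.getType('5') == Character.DECIMAL_DIGIT_NUMBER);  // prints: true
    }
}
```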

Class constructors

Sr.No. Constructor & Description
1 Character(char value)

This constructs a newly allocated Character object that represents the specified char value.
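Because the constructor is deprecated in recent Java versions, a Character object is usually obtained through the static valueOf() method or through autoboxing. The sketch below assumes a small wrapper method box, which is hypothetical and only exists to keep the example testable.

```java
class ValueOfSketch {

    // Hypothetical wrapper around Character.valueOf(char).
    static Character box(char c) {
        return Character.valueOf(c);
    }

    public static void main(String[] args) {
        Character a = box('a');
        Character b = box('b');

        System.out.println(a.charValue());        // prints: a
        System.out.println(a.equals(box('a')));   // prints: true
        System.out.println(a.compareTo(b) < 0);   // prints: true
        System.out.println(a.toString() + b);     // prints: ab
    }
}
```

Comparing with equals() or compareTo() rather than == keeps the comparison independent of whether two Character references point to the same object.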

Class methods

Sr.No. Method & Description
1 static int charCount(int codePoint)

This method determines the number of char values needed to represent the specified character (Unicode code point).

2 char charValue()

This method returns the value of this Character object.

3 static int codePointAt(char[] a, int index)

This method returns the code point at the given index of the char array.

4 static int codePointBefore(char[] a, int index)

This method returns the code point preceding the given index of the char array.

5 static int codePointCount(char[] a, int offset, int count)

This method returns the number of Unicode code points in a subarray of the char array argument.

6 int compareTo(Character anotherCharacter)

This method compares two Character objects numerically.

7 static int digit(char ch, int radix)

This method returns the numeric value of the character ch in the specified radix.

8 boolean equals(Object obj)

This method compares this object against the specified object.

9 static char forDigit(int digit, int radix)

This method determines the character representation for a specific digit in the specified radix.

10 static byte getDirectionality(char ch)

This method returns the Unicode directionality property for the given character.

11 static int getNumericValue(char ch)

This method returns the int value that the specified Unicode character represents.

12 static int getType(char ch)

This method returns a value indicating a character's general category.

13 int hashCode()

This method returns a hash code for this Character.

14 static boolean isDefined(char ch)

This method determines if a character is defined in Unicode.

15 static boolean isDigit(char ch)

This method determines if the specified character is a digit.

16 static boolean isHighSurrogate(char ch)

This method determines if the given char value is a high-surrogate code unit (also known as leading-surrogate code unit).

17 static boolean isIdentifierIgnorable(char ch)

This method determines if the specified character should be regarded as an ignorable character in a Java identifier or a Unicode identifier.

18 static boolean isISOControl(char ch)

This method determines if the specified character is an ISO control character.

19 static boolean isJavaIdentifierPart(char ch)

This method determines if the specified character may be part of a Java identifier as other than the first character.

20 static boolean isJavaIdentifierStart(char ch)

This method determines if the specified character is permissible as the first character in a Java identifier.

21 static boolean isLetter(char ch)

This method determines if the specified character is a letter.

22 static boolean isLetterOrDigit(char ch)

This method determines if the specified character is a letter or digit.

23 static boolean isLowerCase(char ch)

This method determines if the specified character is a lowercase character.

24 static boolean isLowSurrogate(char ch)

This method determines if the given char value is a low-surrogate code unit (also known as trailing-surrogate code unit).

25 static boolean isMirrored(char ch)

This method determines whether the character is mirrored according to the Unicode specification.

26 static boolean isSpaceChar(char ch)

This method determines if the specified character is a Unicode space character.

27 static boolean isSupplementaryCodePoint(int codePoint)

This method determines whether the specified character (Unicode code point) is in the supplementary character range.

28 static boolean isSurrogatePair(char high, char low)

This method determines whether the specified pair of char values is a valid surrogate pair.

29 static boolean isTitleCase(char ch)

This method determines if the specified character is a titlecase character.

30 static boolean isUnicodeIdentifierPart(char ch)

This method determines if the specified character may be part of a Unicode identifier as other than the first character.

31 static boolean isUnicodeIdentifierStart(char ch)

This method determines if the specified character is permissible as the first character in a Unicode identifier.

32 static boolean isUpperCase(char ch)

This method determines if the specified character is an uppercase character.

33 static boolean isValidCodePoint(int codePoint)

This method determines whether the specified code point is a valid Unicode code point value in the range of 0x0000 to 0x10FFFF inclusive.

34 static boolean isWhitespace(char ch)

This method determines if the specified character is white space according to Java.

35 static int offsetByCodePoints(char[] a, int start, int count, int index, int codePointOffset)

This method returns the index within the given char subarray that is offset from the given index by codePointOffset code points.

36 static char reverseBytes(char ch)

This method returns the value obtained by reversing the order of the bytes in the specified char value.

37 static char[] toChars(int codePoint)

This method converts the specified character (Unicode code point) to its UTF-16 representation stored in a char array.

38 static int toCodePoint(char high, char low)

This method converts the specified surrogate pair to its supplementary code point value.

39 static char toLowerCase(char ch)

This method converts the character argument to lowercase using case mapping information from the UnicodeData file.

40 String toString()

This method returns a String object representing this Character's value.

41 static char toTitleCase(char ch)

This method converts the character argument to titlecase using case mapping information from the UnicodeData file.

42 static char toUpperCase(char ch)

This method converts the character argument to uppercase using case mapping information from the UnicodeData file.

43 static Character valueOf(char c)

This method returns a Character instance representing the specified char value.
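A few of the static methods from the table can be combined as in the sketch below. The class name CharacterMethodsSketch and the helpers classify and parseHex are hypothetical and written only for this illustration.

```java
import java.util.Arrays;

class CharacterMethodsSketch {

    // Counts letters, digits and whitespace characters in a string.
    // Returns the counts as { letters, digits, whitespace }.
    static int[] classify(String s) {
        int letters = 0, digits = 0, spaces = 0;
        for (char ch : s.toCharArray()) {
            if (Character.isLetter(ch)) {
                letters++;
            } else if (Character.isDigit(ch)) {
                digits++;
            } else if (Character.isWhitespace(ch)) {
                spaces++;
            }
        }
        return new int[] { letters, digits, spaces };
    }

    // Parses a hexadecimal string with Character.digit(ch, 16),
    // which returns -1 when ch is not a valid digit in that radix.
    static int parseHex(String s) {
        int value = 0;
        for (char ch : s.toCharArray()) {
            int d = Character.digit(ch, 16);
            if (d < 0) {
                throw new IllegalArgumentException("Not a hex digit: " + ch);
            }
            value = value * 16 + d;
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(classify("Java 17 LTS")));  // prints: [7, 2, 2]
        System.out.println(Character.toUpperCase('q'));                // prints: Q
        System.out.println(Character.forDigit(11, 16));                // prints: b
        System.out.println(parseHex("ff"));                            // prints: 255
    }
}
```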

Methods inherited

This class inherits methods from the following classes −

  • java.lang.Object

Example

The following example shows the usage of the Java Character charCount() method. In this program, we've created an int variable and assigned it a hexadecimal code point value. Then, using the charCount() method, we've checked whether the code point needs one char value or two, that is, whether it is a supplementary character. The result is then printed. Note that charCount() does not validate that its argument is a valid Unicode code point.


package com.tutorialspoint;

public class CharacterDemo {
   public static void main(String[] args) {

      // create and assign values to int codepoint cp
      int cp = 0x12345;

      // create an int res
      int res;

      // assign the result of charCount on cp to res
      res = Character.charCount(cp);
      String str1 = "It is not a valid supplementary character";
      String str2 = "It is a valid supplementary character";

      // print res value
      if ( res == 1 ) {
         System.out.println( str1 );
      } else if ( res == 2 ) {
         System.out.println( str2 );
      }
   }
}

Output

Let us compile and run the above program. This will produce the following result −


It is a valid supplementary character
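The same code point can be explored further with the surrogate-related methods from the table. This is a minimal sketch; the class name SupplementarySketch and the helper roundTrips are assumptions for illustration.

```java
class SupplementarySketch {

    // Hypothetical helper: encodes a code point to UTF-16 with toChars()
    // and decodes it again with codePointAt(), returning true if they match.
    static boolean roundTrips(int codePoint) {
        char[] units = Character.toChars(codePoint);
        return Character.codePointAt(units, 0) == codePoint;
    }

    public static void main(String[] args) {
        int cp = 0x12345;

        // A supplementary code point is stored as two char values
        char[] units = Character.toChars(cp);
        System.out.println(units.length);                                        // prints: 2
        System.out.println(Character.isSupplementaryCodePoint(cp));              // prints: true

        // The two char values form a high/low surrogate pair
        System.out.println(Character.isHighSurrogate(units[0]));                 // prints: true
        System.out.println(Character.isLowSurrogate(units[1]));                  // prints: true
        System.out.println(Character.isSurrogatePair(units[0], units[1]));       // prints: true

        // The pair can be converted back to the original code point
        System.out.println(Character.toCodePoint(units[0], units[1]) == cp);     // prints: true

        // A character in the Basic Multilingual Plane needs only one char
        System.out.println(Character.charCount('A'));                            // prints: 1
        System.out.println(roundTrips('A') && roundTrips(cp));                   // prints: true
    }
}
```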
