Friday 13 January 2012

How to Use the Code Point Methods of Java String to Get Unicode Characters: Example Tutorial

The code point methods of String are used to get the Unicode code point value at an index or just before an index. The String class has a lot of utility methods for dealing with Strings, like split(), replace() and substring(), but here I am discussing three relatively lesser-known methods: codePointAt(), codePointCount() and codePointBefore(). Before going deep into these methods, let's first understand what a code point is, what exactly the code point methods do, and how to use codePointAt() and codePointBefore() in Java code.

What is a Code Point in Java?
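Code points are the numbers used in a coded character set, where a coded character set is a collection of characters in which each character is assigned a unique number. The coded character set defines the range of valid code points. Valid code points for Unicode are U+0000 to U+10FFFF. A Java char is only 16 bits wide, so code points above U+FFFF are stored in a String as a pair of char values (a surrogate pair), which is why the code point methods exist alongside charAt().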
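To make the range concrete, here is a small sketch using constants and helpers from java.lang.Character. The class name CodePointRangeDemo and its helper methods are my own, chosen only for illustration.

```java
class CodePointRangeDemo {
    // Formats the bounds of the Unicode code point range as defined by Character.
    static String range() {
        return String.format("U+%04X to U+%04X",
                Character.MIN_CODE_POINT, Character.MAX_CODE_POINT);
    }

    // True if the number lies inside the valid Unicode code point range.
    static boolean isValid(int codePoint) {
        return Character.isValidCodePoint(codePoint);
    }

    // Number of char values (1 or 2) needed to store the code point in a String.
    static int charsNeeded(int codePoint) {
        return Character.charCount(codePoint);
    }

    public static void main(String[] args) {
        System.out.println(range());               // U+0000 to U+10FFFF
        System.out.println(isValid(0x10FFFF));     // true
        System.out.println(isValid(0x110000));     // false
        System.out.println(charsNeeded(0x41));     // 1 ('A')
        System.out.println(charsNeeded(0x1F600));  // 2 (stored as a surrogate pair)
    }
}
```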

What does the codePointAt method of Java String do?

Syntax of the method:
public int codePointAt(int index): this method returns the Unicode code point value of the character at the specified index. It throws IndexOutOfBoundsException if the index passed as argument is negative or not less than the length of the string.
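A minimal sketch of codePointAt() in use might look like the following. The class name CodePointAtDemo, the describe() helper and the sample text are assumptions made for illustration. The sample contains one character above U+FFFF to show how a surrogate pair is handled.

```java
class CodePointAtDemo {
    // Returns the code point at the given index in U+XXXX notation.
    static String describe(String text, int index) {
        int codePoint = text.codePointAt(index);
        return String.format("U+%04X", codePoint);
    }

    public static void main(String[] args) {
        // 'A' followed by U+1F600, which Java stores as the pair \uD83D \uDE00.
        String text = "A\uD83D\uDE00";

        System.out.println(text.codePointAt(0)); // 65
        System.out.println(describe(text, 0));   // U+0041
        System.out.println(describe(text, 1));   // U+1F600 (high + low surrogate combined)
        System.out.println(describe(text, 2));   // U+DE00 (low surrogate on its own)

        try {
            text.codePointAt(text.length());     // index equal to length is out of range
        } catch (IndexOutOfBoundsException e) {
            System.out.println("index out of range");
        }
    }
}
```

Note that index 1 returns the full supplementary code point, while index 2 points into the middle of the pair and returns only the low surrogate value.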

So now it is clear from the syntax what we need to pass as the argument of the method and what it will return. Now, why do we need the Unicode value of a particular character? The answer is that Unicode is the character encoding standard designed to clean up the mess of dozens of mutually incompatible ASCII extensions and special encodings, and to allow the computer interchange of text in any of the world's writing systems.
Unicode has widespread acceptance. As we know, computers just deal with numbers: they store letters and other characters by assigning a number to each one. Before Unicode was invented, there were hundreds of different encoding systems for assigning these numbers, and no single encoding could contain enough characters.

So in short, the code point methods help us get the Unicode value of a particular character, which in turn helps in internationalization and localization. Unicode is an international character set standard which supports all of the major scripts of the world, as well as common technical symbols.
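The other two methods mentioned at the start, codePointBefore() and codePointCount(), can be sketched in the same way. The class name CodePointCountDemo, its helper methods and the sample text are hypothetical and only serve the example.

```java
class CodePointCountDemo {
    // Counts code points over the whole string; a surrogate pair counts once.
    static int countCodePoints(String text) {
        return text.codePointCount(0, text.length());
    }

    // Returns the code point that ends just before the end of the string.
    // Assumes the string is not empty.
    static int lastCodePoint(String text) {
        return text.codePointBefore(text.length());
    }

    public static void main(String[] args) {
        // "Hi" followed by U+1F600, stored as the surrogate pair \uD83D \uDE00.
        String text = "Hi\uD83D\uDE00";

        System.out.println(text.length());                            // 4 char values
        System.out.println(countCodePoints(text));                    // 3 code points
        System.out.println(Integer.toHexString(lastCodePoint(text))); // 1f600
        System.out.println(text.codePointBefore(1));                  // 72 ('H')
    }
}
```

The difference between length() and codePointCount() is the practical reason to reach for these methods when the text may contain characters outside the 16-bit range.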
