The text data present in images and video contain certain useful information for automatic annotation, indexing, and structuring of images. However variations of the text due to differences in text style, font, size, orientation, alignment as well as low image contrast and complex background make the problem of automatic text extraction extremely difficult and challenging job. A large number of techniques have been proposed to address this problem and the purpose of this paper is to design algorithms for each phase of extracting text from a video using java libraries and classes. Here first we frame the input video into stream of images using the Java Media Framework (JMF) with the input being a real time or a video from the database. Then we apply pre processing algorithms to convert the image to gray scale and remove the disturbances like superimposed lines over the text, discontinuity removal, and dot removal. Then we continue with the algorithms for localization, segmentation and recognition for which we use the neural network pattern matching technique. The performance of our approach is demonstrated by presenting experimental results for a set of static images.