African
Journals Online
South African Journal of Information Management
Volume 3, Issue 2, September 2001
Abstracts
Automatic extraction and analysis of financial data from
the EDGAR database
Leinemann, C.; Schlottmann, F.; Seese, D.; Stuempert, T.
Abstract: In this article the authors discuss a new
methodology for extracting financial data from the Electronic Data
Gathering, Analysis and Retrieval (EDGAR) database of the
Securities and Exchange Commission (SEC), which contains financial
information on about 68,000 companies. In documents of this
database, for example 10-K or 10-Q filings, the beginning of a
balance sheet or income statement for a single company and a
single year is sometimes introduced by SGML tags, while the
financial data itself, such as balance sheet items, is in pure ASCII
format. We introduce text mining procedures to detect relevant
financial data in these documents. This is accomplished by
dextrapi (data extraction API), a wrapper for extracting
information from any text-based source. The extracted information
is then transformed into machine-understandable XML syntax,
supporting quick trading decisions by stock market
investors. The advantage of dextrapi over existing wrappers, for
example the World-Wide Web Wrapper Factory (W4F) or the Java
Extraction and Dissemination of Information (JEDI) wrapper, lies
in its ability to adapt the extraction process to
semistructured input, whereas most other wrappers rely on fixed
data formats for extraction (e.g. extracting only from HTML
documents). Furthermore, we introduce Edgar2xml, a software agent
based on the dextrapi wrapper that automates the process of
extracting and evaluating balance sheet data and related
information from the EDGAR database. Evaluation is done with XML
output that conforms to an XML schema, that is, a set of rules
describing the underlying document structure of the XML
document.
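The idea of locating a tag-introduced statement in an otherwise plain ASCII filing and converting its rows to XML can be illustrated with a minimal sketch. This is not dextrapi or Edgar2xml, whose interfaces are not described here; the sample text, the function names and the XML element names (statement, item, value) are all assumptions made for illustration, and no schema validation is attempted.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical excerpt in the style of a filing: SGML tags open the table,
# while the line items themselves are plain ASCII text.
SAMPLE = """\
<TABLE>
<CAPTION>
BALANCE SHEET                      2000        1999
<S>                                <C>         <C>
Cash and cash equivalents ....     1,234       1,100
Total assets .................    10,500       9,800
Total liabilities ............    (4,200)     (3,900)
</TABLE>
"""


def parse_amount(token):
    """Convert '1,234' to 1234 and the accounting form '(4,200)' to -4200."""
    negative = token.startswith("(") and token.endswith(")")
    value = int(token.strip("()").replace(",", ""))
    return -value if negative else value


def extract_statement(text):
    """Return (header, rows) for the first <TABLE> block found in text."""
    inside = False
    header = None
    rows = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.upper().startswith("<TABLE"):
            inside = True
            continue
        if stripped.upper().startswith("</TABLE"):
            break
        # Skip text outside the table, blank lines and remaining SGML tags.
        if not inside or not stripped or stripped.startswith("<"):
            continue
        # Replace dot leaders with a column gap, then split on 2+ spaces.
        fields = re.split(r"\s{2,}", re.sub(r"\s*\.{2,}\s*", "  ", stripped))
        if header is None:
            header = fields  # first text row: title followed by the years
        else:
            rows.append((fields[0], [parse_amount(f) for f in fields[1:]]))
    return header, rows


def to_xml(header, rows):
    """Build an XML tree with one <item> per row and one <value> per year."""
    root = ET.Element("statement", title=header[0])
    for label, amounts in rows:
        item = ET.SubElement(root, "item", label=label)
        for year, amount in zip(header[1:], amounts):
            ET.SubElement(item, "value", year=year).text = str(amount)
    return root


header, rows = extract_statement(SAMPLE)
xml_text = ET.tostring(to_xml(header, rows), encoding="unicode")
print(xml_text)
```

A fixed pattern like this only handles one layout; the adaptivity to semistructured input that the abstract attributes to dextrapi is exactly what such a simple sketch lacks.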
Staying abreast with information published in digital
sources
Van Brakel, P.A.; Mafa, N.C.
Abstract: Soon after the invention of the World-Wide
Web, with its terabytes of information and millions of Web
sites, all the role players in the digital information
environment realised how difficult it was not only to find
relevant and precise information via the Web, but also to stay
abreast of the megabytes of information being added to these Web
sites on a daily basis. This article attempts to categorize the
various approaches that have been developed until recently to
assist end-users in identifying and evaluating recently published
information. Approaches such as Web casting, Web personalization,
collaborative filtering, Web tracking services and more are
identified, described and illustrated by a number of screen
displays.
Current standards for an effective portal application in an
enterprise with specific reference to Woolworths
Van Biljon, S.
Abstract: In this case study, criteria for corporate
portals are explained and their application in the Woolworths
intranet highlighted.
Investigation of a Web-based expert system shell
Vogts, D.
Abstract: A Web-based expert system shell for the
identification problem is discussed. The article gives a
brief explanation of the identification problem and the
solution that was implemented. The ability to change the user
interface at will is a feature of the expert system shell, and its
implementation in the system is discussed. Lastly, the
ability to link expert systems together into a network is briefly
discussed and some possible applications are named.