2024 Author: Howard Calhoun. Last modified: 2023-12-17 10:16
It is hard for a modern person to imagine life without the Internet and near-instant access to information sources. Users rarely think about how the search for the desired content is actually carried out on the network, yet the process is worth a closer look.
An information retrieval system (abbreviated IPS in this article) is a complex software and hardware system that selects information at the user's request. Information is stored on servers in digital form, much as books used to be stored on library shelves. The system consists of many subsystems, each of which performs its own task in processing the user's request and returning information in text or audio form. The variety of tasks to be solved explains the complex architecture of modern information retrieval systems. To the user, the system is a kind of "black box": the text of the request goes in, what happens inside is unknown, and comprehensive information comes out.
Input streams
Requests that a person types on the screen of a device make up only a small part of the requests a search engine processes. Most search queries are generated by robots that accept a human request, perform a multi-step search, and exchange feedback with the user. Well-known information retrieval systems include Google, Yandex, and others, which process millions of requests daily.
Source search objects
The source objects of interest for a search are documents, records, videos, images, and more. They are created outside the IPS. A general-purpose information storage and retrieval system should have a built-in bibliographic system, a kind of catalog that makes any kind of object searchable.
Objects, or their digital transformations, become the input resource of the IPS. The information the user needs is selected from among them.
External sources
The selection process relies on external knowledge sources. These describe the information the user is looking for: the title of a movie, a quote from a book, and so on. For a computer search, this information must be translated into a query in an algorithmic language. In the IPS, this is done by the blocks for creating representations, indexing, and developing queries.
Ideally, these three processes (representation, indexing, and query development) should rely on identical sources of knowledge, but in practice this is not achievable.
Knowledge sources should be constantly reviewed and updated, and the updates should be identical and synchronized. Moreover, an external knowledge source always predates its use in a search engine query, sometimes by several years.
Representations
Representations of the original objects are built from the input data, either combined in some way or transformed according to the rules and algorithms of a particular information retrieval system.
Representations are more or less transformed copies of the original search object. In a collection of unedited full texts, each text is its own representation. In a collection of museum exhibits and artifacts, the representation can be a transformed description of the object together with its image. In some cases, the representation is derived partly from the original object and partly from a description: in bibliographic search engines, fields derived from the object, such as the title and the author's name, are combined with an annotation of the work.
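The bibliographic case can be pictured with a minimal sketch. Everything here is an assumption made for illustration: the `SourceObject` class, the `build_representation` function, the field names, and the sample data are hypothetical and do not describe any particular system.

```python
from dataclasses import dataclass


@dataclass
class SourceObject:
    """A work created outside the retrieval system (hypothetical fields)."""
    title: str
    author: str
    full_text: str


def build_representation(obj, annotation):
    """Combine fields taken from the object with an external description."""
    return {
        "title": obj.title,        # derived from the object itself
        "author": obj.author,      # derived from the object itself
        "annotation": annotation,  # derived from a separate description
    }


# Invented sample data, used only to show the shape of the result.
book = SourceObject("Example Title", "A. Author", "Full text of the work")
rep = build_representation(book, "A short annotation of the work.")
print(rep)
```

In this sketch the full text stays with the source object, and only the selected fields plus the annotation make up the representation.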
Searchable index
Since information in an information retrieval system is stored in the form of a representation, it is logical to assume that the search runs over the whole representation, which is then returned to the user after selection. In practice, this is not the case. For example, current online library catalogs typically restrict searches to a few fields (author, title, and subtitles) within a representation that also contains other fields that are not searched. This is reason enough to distinguish between a representation and a searchable index, which is the searchable part of the representation. The index defines everything that should be searchable. Like the representation and the source object, a searchable index can be split into separate sub-indexes to provide more precise, targeted searches.
Search engines usually also maintain an internal synthetic structure for matching valid search results. This structure is the second component of the searchable index.
Procedurally, indexing can be implemented in different ways. A searchable index can be obtained by:
- literally copying the searchable part of the representation;
- copying details of the representation. These may be part or all of the representations, which physically exist only as fragments distributed according to the index-building rules and are assembled when needed.
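As a rough illustration of a searchable index that covers only part of each representation and is split into per-field sub-indexes, consider the sketch below. The records, the `SEARCHABLE_FIELDS` tuple, and the `build_index` function are hypothetical, and the inverted-index layout is one common choice rather than the structure of any specific system.

```python
from collections import defaultdict

# Hypothetical representations; only some fields are meant to be searchable.
representations = {
    1: {"title": "Search Engines", "author": "Ann Lee", "shelf": "A3"},
    2: {"title": "Library Catalogs", "author": "Bob Ray", "shelf": "B7"},
}
SEARCHABLE_FIELDS = ("title", "author")  # assumption: "shelf" is not searched


def build_index(reps, fields):
    """Build one inverted sub-index per searchable field: term -> set of ids."""
    index = {field: defaultdict(set) for field in fields}
    for doc_id, rep in reps.items():
        for field in fields:
            for term in rep[field].lower().split():
                index[field][term].add(doc_id)
    return index


index = build_index(representations, SEARCHABLE_FIELDS)
print(sorted(index["title"]["search"]))  # ids whose title contains "search"
print("shelf" in index)                  # the non-searchable field is absent
```

The "shelf" field remains part of the representation and can still be displayed, but no query can match on it because it never enters the index.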
Request Design Rules and Formal Requests
Query engineering is a function that mediates between a user query and a formal query. Before retrieval, it transforms the user's query by matching it against the retrieval command dictionaries, the index specification, and the index itself. In the early days of IPS, this role was traditionally assigned to qualified IT specialists.
The component that develops computer queries and maps dictionary terms onto the searchable index is commonly referred to as the "dictionary input" module. Automating this function is promising and opens the way to expert and probabilistic search methods.
A user's request becomes a formal request once it has been converted. Examples of such formal transformations include truncation, substitution, normalization, vectorization, and other conversions of the "external" representation into the "internal" representations of a computer IPS.
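The transformations named above can be chained as in the following sketch. It is only an illustration under stated assumptions: the function names, the substitution table, the five-character truncation length, and the bag-of-words vector are hypothetical simplifications, not the method of any real search engine.

```python
import re
from collections import Counter

# Hypothetical substitution table mapping user vocabulary to index vocabulary.
SUBSTITUTIONS = {"film": "movie"}


def normalize(text):
    """Normalization: lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def substitute(terms):
    """Substitution: replace terms that have an entry in the table."""
    return [SUBSTITUTIONS.get(term, term) for term in terms]


def truncate(terms, length=5):
    """Truncation: keep only the first `length` characters of each term."""
    return [term[:length] for term in terms]


def vectorize(terms):
    """Vectorization: represent the query as term counts (bag of words)."""
    return Counter(terms)


def to_formal_query(user_query):
    """Convert the external (user) form of a query into an internal form."""
    return vectorize(truncate(substitute(normalize(user_query))))


formal = to_formal_query("Film about Searching, searches")
print(formal)
```

Here "Searching" and "searches" collapse to the same truncated term, so the formal query counts it twice, while "Film" is replaced by "movie" before truncation.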
Extracted Document Link Sets
The resulting set of information sources is logically a subset of the representations, selected by applying the matching rules to the formal query against the searchable index.
Usually, but not necessarily, the retrieved set then goes through a separate sorting process. Online library catalogs usually reorder retrieved sets alphabetically by author before displaying them. In information retrieval systems that produce strict rankings, the ranking order is established before any reordering.
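The difference between matching, ranking, and a later reordering can be sketched as follows. The tiny index, the author table, and the "count the matched terms" scoring rule are assumptions chosen for brevity; real systems use far more elaborate matching rules.

```python
# Hypothetical searchable index (term -> ids) and author names per id.
index = {"search": {1, 3}, "library": {2, 3}}
authors = {1: "Young", 2: "Adams", 3: "Miller"}


def retrieve(query_terms):
    """Assumed matching rule: a document matches if it contains any term.

    Returns a dict mapping each matching id to its number of matched terms.
    """
    scores = {}
    for term in query_terms:
        for doc_id in index.get(term, ()):
            scores[doc_id] = scores.get(doc_id, 0) + 1
    return scores


scores = retrieve(["search", "library"])

# Ranking: highest score first, ties broken by id.
ranked = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

# Catalog-style reordering: alphabetical by author, ignoring the score.
by_author = sorted(scores, key=lambda doc_id: authors[doc_id])

print(ranked)     # → [3, 1, 2]
print(by_author)  # → [2, 3, 1]
```

Both orderings contain the same retrieved set; only the presentation order differs.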
Output streams
Traditionally, search results are shown on a display, but more often they are output as a stream of objects to be used elsewhere or for some other purpose. This output completes the main search loop. Such streams can be sent to visualization devices, placed in storage for further processing, or used as input streams for other selection services.
Information retrieval systems allow feedback from the output of any selection process: the output of any process can be fed back into other processes. Feedback can provide the basis for expert judgment at any stage.