EMPLOYING PROBABILISTIC MATCHING ALGORITHMS FOR IDENTITY MANAGEMENT

Chapter One
1.1. Introduction.
The telecommunications industry is one of the subsectors that make up the Information and Telecommunications Technology sector. It includes all telephone providers, Internet Service Providers (ISPs), and radio and television firms. With the proliferation of devices, the telecommunications business has grown in both size and complexity.

The telecommunications sector is an extremely profitable enterprise. According to research, the revenue from telecommunications services will increase from $2.2 trillion in 2015 to $2.4 trillion in 2019 as the business expands.

One way the industry achieves this growth is through advertising. Advertisers spend a lot of money promoting their services, so there is a need to advertise products and services specifically to the target user.

When products and services are advertised to their intended audience, there is a greater likelihood of corporations selling and customers purchasing. As a result, it is necessary to determine who is using a device at any given time and what that person is interested in.

Individuals (or households, as the case may be) can now own many gadgets due to the rapid expansion in the number of available options. Identity Management is used to address the demand for user identification at any given time as well as advertising to a specific audience.

Because Identity Management is concerned with persons, many characteristics of individuals are used to implement it. Individual attributes are divided into three categories: personal attributes, social behaviour attributes, and social connection attributes (Li and Wang, 2015).

To carry out effective Identity Management, traits and individuals must be correctly matched. As a result, a matching algorithm is used to find the best match possible.
1.2. BACKGROUND STUDY.

1.2.1. About Record Linkage

Halbert L. Dunn coined the phrase “record linkage” in his 1946 paper “Record Linkage,” where it refers to the linking of medical records belonging to the same individual.

Halbert Dunn detailed a technique established by the Dominion Bureau of Statistics in Canada in which information containing individual names from microfiche was transferred to punch cards and then printed for verification and review by various departments in Canada. Compared to manually matching and maintaining paper files, the approaches mentioned above were more efficient and cost-effective.

Howard Newcombe, a geneticist, pioneered computerised record linkage with his papers Automatic Linkage of Vital Records (1959) and Record Linkage: Making Maximum Use of the Discriminating Power of Identifying Information (1962).

Newcombe’s methods used odds ratios and value-specific frequencies, such as ‘Smith’ having less distinguishing power than ‘Zabrinsky’. Then, in 1969, Fellegi and Sunter formalised Newcombe’s ideas mathematically in their publication A Theory for Record Linkage.

They demonstrated the optimality of Newcombe’s classification rule and proposed other approaches for calculating ‘optimal’ parameters (probabilities used in likelihood ratios) in the absence of training data.
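The weighting idea described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used in this project: the m-probabilities, the surname frequencies, the fallback frequency, and the two thresholds are all hypothetical values chosen for the example, and the helper names (field_weight, pair_weight, classify) are assumptions. Each field contributes a log likelihood ratio, and a rare surname such as 'Zabrinsky' earns a larger agreement weight than a common one such as 'Smith'.

```python
import math

# Hypothetical m-probabilities: P(field agrees | both records are the same person).
M_PROB = {"surname": 0.95, "birth_year": 0.98}

# Hypothetical relative frequencies of surnames, used as value-specific
# u-probabilities: P(field agrees | records are different people).
SURNAME_FREQ = {"smith": 0.01, "zabrinsky": 0.0001}
DEFAULT_SURNAME_U = 0.001  # hypothetical fallback for surnames not in the table
BIRTH_YEAR_U = 0.02        # hypothetical chance that two strangers share a birth year


def field_weight(agrees: bool, m: float, u: float) -> float:
    """Log2 likelihood ratio contributed by one field comparison."""
    if agrees:
        return math.log2(m / u)
    return math.log2((1 - m) / (1 - u))


def pair_weight(a: dict, b: dict) -> float:
    """Sum the field weights for a pair of records (fields assumed independent)."""
    # Simplification: the u-probability is looked up from record a's surname.
    surname_u = SURNAME_FREQ.get(a["surname"], DEFAULT_SURNAME_U)
    total = field_weight(a["surname"] == b["surname"], M_PROB["surname"], surname_u)
    total += field_weight(a["birth_year"] == b["birth_year"],
                          M_PROB["birth_year"], BIRTH_YEAR_U)
    return total


def classify(weight: float, lower: float = 0.0, upper: float = 10.0) -> str:
    """Three-way decision rule with hypothetical lower and upper thresholds."""
    if weight >= upper:
        return "match"
    if weight <= lower:
        return "non-match"
    return "possible match"


smith_a = {"surname": "smith", "birth_year": 1980}
smith_b = {"surname": "smith", "birth_year": 1975}
zab = {"surname": "zabrinsky", "birth_year": 1980}
jones = {"surname": "jones", "birth_year": 1990}

print(classify(pair_weight(zab, zab)))          # agreement on a rare surname
print(classify(pair_weight(smith_a, smith_b)))  # common surname, year differs
print(classify(pair_weight(smith_a, jones)))    # nothing agrees
```

Pairs that fall between the two thresholds are the ones that would typically be sent for clerical review.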

Training data is a set of record pairings for which the true matching status is known, created, for example, by specific iterative review processes that get the ‘true’ matching status for large subsets of pairs (Winkler, 2015).
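When such training data is available, the probabilities used in the likelihood ratios can be estimated by simple counting. The sketch below assumes each training item is a tuple of two records and a known match flag; the function name estimate_m_u, the record layout, and the sample pairs are hypothetical and serve only to illustrate the idea.

```python
def estimate_m_u(pairs, field):
    """Estimate m = P(agree | match) and u = P(agree | non-match) for one field.

    pairs: iterable of (record_a, record_b, is_match) with known match status.
    """
    agree_match = total_match = 0
    agree_non = total_non = 0
    for a, b, is_match in pairs:
        agrees = a[field] == b[field]
        if is_match:
            total_match += 1
            agree_match += agrees
        else:
            total_non += 1
            agree_non += agrees
    if total_match == 0 or total_non == 0:
        raise ValueError("training data needs both matches and non-matches")
    return agree_match / total_match, agree_non / total_non


# Hypothetical labelled pairs: (record_a, record_b, true matching status).
training_pairs = [
    ({"surname": "smith"}, {"surname": "smith"}, True),
    ({"surname": "okafor"}, {"surname": "okafor"}, True),
    ({"surname": "lee"}, {"surname": "lee"}, True),
    ({"surname": "zabrinsky"}, {"surname": "zabrinski"}, True),  # typo in a true match
    ({"surname": "smith"}, {"surname": "smith"}, False),  # two different Smiths
    ({"surname": "smith"}, {"surname": "lee"}, False),
    ({"surname": "okafor"}, {"surname": "lee"}, False),
    ({"surname": "lee"}, {"surname": "zabrinsky"}, False),
]

m, u = estimate_m_u(training_pairs, "surname")
print(m, u)
```

With only a handful of pairs the estimates are very rough; in practice a much larger reviewed sample would be needed.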

1.2.2. Identity Management

Because of the widespread availability and rapid development of technology and web applications, unauthorised users can easily gain access to a variety of programmes. This pushes developers to design more secure environments for applications by allowing for greater control over access.

Identity management dates back to the nineteenth century, when the government of the United Kingdom made it mandatory for citizens to register new births. By 1902, birth registration had also been standardised across the United States.

In the twentieth century, the United States saw the birth of the first driver’s licence, passport, Social Security number, digital identities and passwords, and commercial internet.

Passwords were created to protect the privacy of individuals and organisations. During that time, identity management consisted primarily of handwritten sheets and other account-tracking systems.

Traditional Identity Management systems were adapted for online applications as soon as the commercial internet came into existence.

The population of internet users expanded to almost 400 million in 2000, increasing the number of vices, such as identity theft, committed by and against people via the internet.

Due to the need to combat these vices, an effective and efficient system was created, and the Identity Management Stack was born. This stack approach had a disadvantage, though: it was incredibly expensive to maintain.

Cloud-based Identity as a Service emerged in 2010 with the goal of simplifying, automating, and lowering the expenses associated with the stack. Identity Management has been fully digitised since 2010 and is now widely used in modern computing.

1.3. Limitations of some works.

Although substantial research has been conducted in the field of Identity Management, most works have focused primarily on individuals’ personal attributes, with little or no attention devoted to their behavioural attributes (Li and Wang, 2015).

Some works focused solely on supervised and semi-supervised machine learning techniques for probabilistic linking (Diaz-Morales, 2015).

Furthermore, probabilistic matching has not been widely employed in the telecommunications business. This paper explores and implements probabilistic matching techniques on data from the telecommunications industry.

1.4. Statement of the Problem

New products and services are regularly conceived, developed, and released, and the most effective way to reach individuals is through advertising. The fastest forms of advertising are those that use telecommunication equipment.

The audience can see and hear what is being advertised from thousands of miles away. Advertising to the general public is one thing, but advertising to the intended audience is quite another.

If the target individual is not reached, sales of such products and services will be minimal, resulting in low profit or even a loss for the advertising company and the telecommunications industry as a whole.

In addition, identity management should take into account both the personal and the behavioural attributes of an individual. A range of machine learning techniques should also be compared, with the one that produces the best match being noted.
