
Recent Trends In Document Clustering With Evolutionary Based Algorithms


Document clustering is the process of organizing an electronic corpus of documents into subgroups that share similar text features. In the past, a number of statistical algorithms were applied to cluster data, including text documents. More recently, there have been efforts to improve clustering performance with optimization-based methods such as evolutionary algorithms. As a result, document clustering with evolutionary algorithms has become an emerging topic that has gained increasing attention in recent years. This paper presents an up-to-date review devoted entirely to evolutionary algorithms designed for document clustering. It first provides a comprehensive examination of the document clustering model, covering its components and related concepts. It then presents and analyzes the principal research work on this topic. Finally, it brings together and classifies the objective functions used across the collected research papers. The paper concludes by addressing some important issues and challenges that could be the subject of future work.

The objective function (or fitness function) is the measure that evaluates the optimality of the solutions an evolutionary algorithm generates in the search space. In the clustering domain, the fitness function reflects the adequacy of the partitioning. It therefore needs to be formulated carefully, taking into consideration that clustering is an unsupervised process.
Different objective functions generate different solutions even from the same evolutionary algorithm. The fitness can also be either a minimization or a maximization function, and the algorithm can be formulated with a single objective function or with multiple objective functions. To sum up, "choosing optimization criterion is one of the fundamental dilemmas in clustering" [43].
As a result, the reviewed studies show considerable diversity in how they formulate or choose their fitness functions. We gather all of these objective functions in a separate section so that they are easy to compare and to develop further.
We noticed that the content and web document clustering algorithms mostly used three groups of functions:
- Similarity / distance measures.
- Inter-cluster measures, intra-cluster measures, or both.
- Internal validity index measures.
On the other hand, the keyword/keyphrase clustering algorithms used either generated or statistical fitness functions.
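As a rough illustration of the first two groups, the sketch below builds an intra-cluster fitness (to be maximized) and an inter-cluster fitness (to be minimized) on top of cosine similarity. It is not taken from any of the reviewed papers: the function names, the centroid-based formulation, and the toy term-weight vectors are all assumptions chosen for brevity.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two term-weight vectors (lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention for empty documents (an assumption)
    return dot / (norm_a * norm_b)


def centroids(docs, labels):
    """Mean vector of each cluster, keyed by cluster label."""
    groups = {}
    for doc, label in zip(docs, labels):
        groups.setdefault(label, []).append(doc)
    return {
        label: [sum(col) / len(members) for col in zip(*members)]
        for label, members in groups.items()
    }


def intra_cluster_fitness(docs, labels):
    """Average similarity of each document to its own centroid (maximize)."""
    cents = centroids(docs, labels)
    total = sum(cosine_similarity(d, cents[l]) for d, l in zip(docs, labels))
    return total / len(docs)


def inter_cluster_fitness(docs, labels):
    """Average pairwise similarity between cluster centroids (minimize)."""
    cents = list(centroids(docs, labels).values())
    pairs = [
        cosine_similarity(cents[i], cents[j])
        for i in range(len(cents))
        for j in range(i + 1, len(cents))
    ]
    return sum(pairs) / len(pairs) if pairs else 0.0


# Hypothetical toy corpus: two documents about each of two disjoint terms.
docs = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
good = [0, 0, 1, 1]  # candidate solution that separates the topics
bad = [0, 1, 0, 1]   # candidate solution that mixes them
```

An evolutionary algorithm would score each candidate partition (here `good` and `bad`) with one of these functions, or with both in a multi-objective formulation, and prefer candidates with higher intra-cluster and lower inter-cluster values.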
Table 1 below lists the objective functions of all the EA-based studies presented. The parameters and/or equations that compose each function are explained beneath it, followed by a description of the type of optimality (minimization/maximization) and the category of the fitness function. The functions are arranged in the same order in which the studies appeared in the previous sections.

Document clustering has been the subject of a growing number of studies. As this research progressed, there were repeated attempts to consolidate and classify these studies in reviews or survey papers. A number of these...
