Data mining has recently acquired a rather negative stigma. While it can be invasive to a person’s privacy, there is generally no need for the average person to be overly worried. One of the major features of data mining is the ability to process massive amounts of information and piece together what is derived from it to find connections or draw conclusions. This is generally done by collecting the “bread crumbs” that users leave on the web, such as name, address, buying habits, and sites visited. While this may seem invasive, it helps for users to understand how much of this information goes unused in the earlier stages of the algorithms behind these processes. For instance, if a company is going to perform a large-scale data mining process to find out which items are generally bought together or which demographic to market to, it only requires basic information about those people. Much of this information is easy to find, as many of these data entries are stored in “cookies,” small files that sites use to store short-term information about the users who visit them.
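To make the "items bought together" example concrete, here is a minimal sketch in Python of how such a count could work. The purchase records and the function name `count_pairs` are made up for illustration and do not come from any real company's process. Note that the count uses only the items in each purchase and nothing that identifies the buyer.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase records: only the items matter, not who bought them.
baskets = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"coffee", "milk"},
    {"bread", "jam"},
    {"coffee", "milk", "bread"},
]


def count_pairs(baskets):
    """Count how often each pair of items appears in the same basket."""
    pair_counts = Counter()
    for basket in baskets:
        # sorted() gives each pair a stable order, so (a, b) and (b, a) match
        for pair in combinations(sorted(basket), 2):
            pair_counts[pair] += 1
    return pair_counts


pair_counts = count_pairs(baskets)

# The pairs seen together most often are candidates for joint marketing.
top_pairs = pair_counts.most_common(3)
```

Real systems work with far larger data sets and more refined measures, but the basic input is the same kind of generic purchase information.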
Because many people don’t see past the privacy aspect of data mining, they are bound to overlook the massive benefits it has for businesses and health care organizations. Diagnostics in the medical field, particularly for cancers, are a good example. Data mining makes it possible to sift through many different cases and draw correlations between them and the case currently being examined, ideally leading to more reliable diagnosis or treatment (Weinstein). The ability to process large amounts of information and produce reliable results from the connections found could give users incredibly useful information, even though it may rely on private data.
The main challenge is finding the happy medium between invasion of privacy and useful, complete information. For instance, a user who wants to avoid detection or data tracking could very easily use a Virtual Private Network (VPN), which provides a temporary IP address. The drawback is that by doing so the user almost entirely erases their presence on the networks they visit and eliminates any information they could have provided to data miners. This led me to my possible application technology, a “smart” VPN. Essentially, this application would integrate a browser with a VPN, but without erasing the user’s presence entirely. It would act as a sort of filter, blocking specific pieces of information from being tracked while allowing more generic information to be mined. One important thing to understand about data mining is that the information being mined is sorted by importance so that the relevant information makes its way to the top for further analysis (Sindhu and Meshram). By this logic, a “smart” VPN could, for example, prevent sites from retaining credit card information while still allowing an online store to use purchase history for more personalized advertisements and sales-trend analysis. Overall, this application has no specific demographic; it is aimed at users who want more than a baseline level of privacy but are still willing to provide simple information to the corporations that require it.
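The filtering idea behind the “smart” VPN can be sketched in a few lines of Python. This is only an illustration of the concept: the field names, the `BLOCKED_FIELDS` set, and the `filter_record` function are all hypothetical, and the sketch does not show the VPN or the browser integration.

```python
# Hypothetical field names; a real tool would need a far more careful policy.
BLOCKED_FIELDS = {"credit_card_number", "card_cvv", "home_address", "full_name"}


def filter_record(record):
    """Return a copy of the record with the blocked fields removed."""
    return {
        key: value
        for key, value in record.items()
        if key not in BLOCKED_FIELDS
    }


# Placeholder data describing one visit to an online store.
visit = {
    "full_name": "Jane Doe",
    "credit_card_number": "0000 0000 0000 0000",
    "purchase_history": ["headphones", "phone case"],
    "pages_viewed": 12,
}

# Only the generic fields are left available for mining.
shared = filter_record(visit)
```

In this sketch the store would still see the purchase history and page count, which is enough for personalized advertisements, while the name and card number never leave the user's side.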
Owens, J. (2001). John Weinstein discusses information-intensive approaches to cancer drug discovery. Drug Discovery Today, 6(22), 1145-1147.
Sindhu, K. K., & Meshram, B. B. (2012). Digital forensics and cyber crime datamining. Journal of Information Security, 3(3), 196-201. Retrieved from http://search.proquest.com/docview/1032966966?accountid=14270