Is there now so much data on the internet that it is actually becoming harder to find the information you seek? As scientific research continues, the quantity of information on the internet (the “data glut”) expands. Quality of information is another issue. Will it become increasingly hard to identify and reach the cutting edge in a given field? If the pace of scientific innovation and the accumulation and integration of knowledge continue to accelerate, as Ray Kurzweil suggests, will we reach a point where groups developing different or similar technologies become incapable of keeping up with each other’s innovations? Will the research efforts of human society become less efficient, with more duplication of effort, as we go forward? Is this already occurring?
Back in the days of paper-only publication, the information flow was held to a manageable pace, and there was less information to manage. As books and journals proliferated over the past few centuries, their relatively slow delivery compared to the internet meant that the difficulty of keeping up increased only gradually. Since the advent of electronic communications, and especially the internet, not only has the accessibility of information improved, but the total quantity has increased exponentially. Add to that the increased specialization in the sciences and an increase in the number of research programs, and the total information available has exploded. Are widely separated but similar lines of research developing so rapidly as to create a chaotic state in which researchers in a given field fail to connect with a significant number of their peers, resulting in redundant developments? Or has specialization allowed researchers to focus more tightly on the work that parallels their own, so that a relatively coherent “cutting edge” persists in the majority of fields?
The internet has created a tradeoff between increased information availability and data overload. When I worked on the development of high-speed image processing computers in the early 1980s, at a time when the internet was nowhere near as useful as it is today, it seems we were no better off than researchers are now, but perhaps no worse off either. Then, as now, we were unable to see many of the results of research funded by governments and corporations, and we also had to put up with publication delays that don’t exist on the internet. At the same time, however, we weren’t overwhelmed by the sheer quantity of information as we are today. Our director, who came from an academic environment, was responsible for assessing competing organizations doing similar research. Some of the team read related books and papers. As a young prototype builder I was focused on my work and was not invited to participate in interactions with other organizations, but I did maintain the technical library for the engineers (99% paper, of course).
Today’s internet searches produce overwhelming quantities of results. A search on any popular scientific topic, such as “nanotechnology” (18.2 million results on Google on August 27th, 2008) or “supercomputer development in 2008” (2.26 million results on Google on September 2nd, 2008), returns far more citations than anyone can read. No person or organization can review all of those results quickly enough to keep up with daily additions and developments, and, if Ray Kurzweil is correct, the rate of creation of new information will continue to grow. Add language barriers, some governments’ restrictions on internet access, and the secrecy of much government-funded and corporate-sponsored research, to name a few variables, and it seems likely that there is already considerable redundancy in scientific work worldwide. After all, good ideas often occur to multiple people around the same time in history. Perhaps redundancy doesn’t signify a much lower level of overall efficiency when so many people and organizations are doing the work. Hopefully different groups are discovering different things, so that redundant efforts are compensated for by sheer numbers.
Commercialization of search engines affects search results. Search engines have to make money to support the continually expanding demand for their services, enhance their tools, and provide new and higher-capacity hardware. One way at least some of them have found to do this is to sell preferred ranking, which skews their results towards commercial ventures and other interests willing to pay for enhanced visibility. Search engines are the best way to find out, via the internet, what new developments are occurring, but paid ranking may make them less than perfect for the job. I wonder how often the information of highest value in an area being researched actually shows up in the first thousand citations.
Are personal contacts and conferences the most effective means of getting to the cutting edge? Personal networking, association with appropriately selected universities and research labs, and attendance at appropriate conferences still work today, and may still be the best way of reaching the cutting edge in any field of study. Is it an illusion, then, that the internet makes reaching the forefront of a topic easier? I contend that, via the internet, one can only approach currency in a topic, and not get all the way to the cutting edge.
Is the internet data glut severe enough to slow progress or decrease our scientific efficiency as a society? As the content of the internet grows enormously (as evidenced by search engine results), it only becomes more difficult to find the knowledge we want. As a result, it is possible that research teams in any given field will have increasing trouble connecting with each other. Will the risk grow that organizations are unaware of progress made by others? Will increasing amounts of duplication occur? I wonder if we could see a slowing of scientific progress relative to the resources being committed, increased blurring of the cutting edge, and a decrease in “scientific efficiency” across human society as a whole.
As always, I welcome your comments. – Tim
The Structure of Scientific Revolutions, Thomas Kuhn, 1964