From Information to Knowledge: Do Universals Exist? Are They Really Necessary?




          The fundamental problem of epistemology is to identify and evaluate the processes that lead from information to knowledge and reason. The problem is not system specific. It is not limited to human cognition, nor even to the behavior of biological systems. We know, or at least can imagine, what some of these processes are like. Information reduction is a key process and would seem to be fundamental, at least for biological systems. Perhaps the many become the one through information reduction; certainly a stimulus can be identified by reducing the level of information it contains. Sensory universals at least, if they exist at all, are produced by reducing information. So two important issues require investigation. First: do universals really exist? Second: are they really necessary?


          Not every system output can be the result of some reduction, summation or simplification process. Sometimes the stimulus is too small and must be amplified to generate the complex motor behavior that results from it. Perhaps noise in the system can also generate output under some circumstances. But an agent system should have a strong bias against any sort of accidental behavior; it is expensive in terms of wasted energy and potentially dangerous. It certainly seems that biological systems have a negative bias, i.e., a bias toward inactivity and inhibition. Except for curiosity-driven exploratory activities, most actions are probably purposeful, and even curiosity itself can be useful. The requirement for overall system control originates at the lowest level of system operations, in the neuron itself. Neurons are too sensitive for their own good, and for ours; they have to be inhibited, at least at the network level. This may be why there is so much GABA, an inhibitory neurotransmitter, in the brain.


          The price of small size and sensitivity, we should suspect, is that inhibition is required, and much of what the CNS does is probably centered on the generation and control of inhibitory activity by inhibitory centers and processes. There is a tendency to imagine that the CNS is passive and reactive, waiting around for a stimulus strong enough to activate it. The truth must be closer to the opposite: the nervous system is too active, and effective reaction, at least, must be controlled. Stimuli are excuses to act and probably work by disinhibition rather than simple excitation. The CNS in particular may react by inhibiting the inhibition, that is, by inhibiting centers that themselves normally inhibit system or network function. These sorts of considerations suggest a very interesting epistemological hypothesis: that knowledge and reason are matters of negation rather than affirmation, and that it is not what things are but rather what they are not that counts at the most basic functional level.
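
The idea of acting by "inhibiting the inhibition" can be made concrete with a toy arithmetic sketch. This is an illustration only and makes no physiological claim: the function names, the tonic drive and inhibition values, and the simple threshold rule are all assumptions introduced here.

```python
def rectify(x: float) -> float:
    """Activity cannot go below zero (hypothetical simplification)."""
    return max(0.0, x)


def disinhibited_output(stimulus: float,
                        tonic_drive: float = 1.0,
                        tonic_inhibition: float = 1.5) -> float:
    """Toy disinhibition circuit.

    The output unit is always driven (tonic_drive), but an inhibitory
    center normally holds it silent. The stimulus does not excite the
    output unit directly; it only suppresses the inhibitory center.
    """
    # The stimulus inhibits the inhibitor.
    inhibitor_activity = rectify(tonic_inhibition - stimulus)
    # Whatever inhibition remains is subtracted from the standing drive.
    return rectify(tonic_drive - inhibitor_activity)


if __name__ == "__main__":
    for s in (0.0, 0.5, 1.0, 1.5):
        print(f"stimulus={s:.1f} -> output={disinhibited_output(s):.1f}")
```

In this sketch the system is silent at rest because it is held back, not because nothing is driving it, and a weak stimulus (0.5) still produces no behavior. That is one way of picturing a built-in bias toward inactivity.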



This idea is probably manifest at the sensory level even if its import at higher levels is uncertain. It is generally thought that the business of sensory systems is recognition, which involves something like feature extraction and feature summation to determine or recognize a stimulus. But the real business of the sensory systems is probably something like differentiation, that is, determining the differences between stimuli of the same type. Exact or unique recognition is probably irrelevant unless the stimulus itself makes it relevant. To put it crudely, it is generally unnecessary to determine exactly what something is; a “close enough” standard may be good enough. The difference here is in the level of feature extraction and integration, and in whether or not it represents the limits of resolution for the differentiating system.


          We are concerned here only with information-based systems, mostly because we don’t yet understand whether a cognitive system that doesn’t use information is possible. The underlying problem here is that we tend to think that knowledge invariably involves universals of some sort. We regard the “many into one” phenomenon as the essential element of intelligence. But is subsumption really necessary? If both the input and output spaces of the system are large or inclusive enough, then the system may effectively say not that “this is of that type,” i.e., that it “partakes” of a given universal, but rather that “this is the same as one other.” Now this “sameness” would seem to invoke just a single universal, and perhaps this is the essential difference between a “best fit against examples or numerous de facto paradigms” approach and an “instance of a universal” approach to successful interaction with the world.


          Whether or not there can be a “sameness”-based system that works as well as a system that uses many universals depends a lot, unfortunately, on many hardware questions as well as on fundamental considerations about the nature of information reception, transduction, processing and storage. We will probably come to see it as a distinction without a difference. No system can internalize all of the external world, and the difference between rounding off or smoothing over the signal and reducing it to a universal form will perhaps always be hard to discern. These issues also depend on what we regard as information, which of course is always going to be system dependent. We should note here also that in neurological systems information is probably best viewed not simply as action potentials, or even streams of action potentials, but rather as network activity.


But now we see that our original formulation of the problem was somewhat erroneous. If virtually anything can count as information (again, it is system dependent), and if no system can internalize everything, then all intelligent systems must use information, and the real issue is how many information-reducing universals the system develops and uses. If the real problem is differentiation, then the question is how much of it to do and how much feature extraction is required to do it. We know, for instance, of Hubel and Wiesel’s line detectors and of the “grandmother cells” of higher-level sensory function, which have very specific stimulus requirements for activation. The general assumption seems to be that higher-level recognition units summate lower-level feature extraction units, probably at a high or maximal level of resolution. But do we really know this? The business of the CNS, as opposed to that of the sensory systems alone, is to map the input space into the output space and, under the appropriate global control influences, to produce actions. Behavior is what the CNS is all about, not figuring out what’s out there. If the system has a bias toward inactivity, then most stimuli will not present as action worthy. Only externalities reaching some critical value for the system will produce behavior, and the response does not require high resolution of all aspects of the stimulus. (But our concerns here are agents of all types and not biological systems alone.)


          (Suppose someone points at a stone and says: “This stone is thinking about Vienna.” Everyone who hears and understands these words and sees the pointing action will immediately know that the speaker is uttering nonsense or making a joke or something. (It would be interesting to find out how long it actually takes to reach this realization or judgment.) How can this happen so quickly? The conventional theories of knowledge or cognition would suggest that there is a reasoning process that goes something like this: “Thinking requires a mind or a brain, something which stones do not and cannot have. Therefore, the proposition is false or ridiculous.” But this sort of justification for a judgment is not necessarily the thing that produces the behavior that manifests it; to assume otherwise is to introduce a theory. The reasons one can give for saying something are not necessarily the causes of saying it. They are just another element of related behavior.


          Now we must know or assume something in order to deny suggestions about thinking stones, but what form does this knowledge take, and how does it produce the reaction to the ridiculous suggestion? The respondent will necessarily have had exposure to talk about thinking or, more generally, to linguistic behavior related to the word “thought” and its cognates. These experiences will result in memory traces involving the word or related words. Is the thinking stone proposition compared against all these memory traces more or less simultaneously, or is it evaluated in terms of a “thought” universal abstracted from them? Is judgment based on reason or history? What kinds of processes produce “concepts”? Believers in concepts, as opposed to those who correctly point out that we can articulate or explain at least some concepts, have a lot of explaining to do.)


          Diagrammatically the problem generated by at least some universals looks like this:

          The many, that is, the different visual forms of the capital letter “A”, are reduced to one: a single output variable representing the one and only “A” known to the higher levels of the system. This is probably the model that was used by earlier theories of recognition processes. Contrast it with this model:


Here there are different processing events, each resulting in a different final-stage output to the rest of the system. While these outputs are different, they are functionally equivalent to the rest of the system. There is no single capital “A” universal that is responsible for the “A”-related behavior produced by the system. Instead there are many “A” exemplars, all introduced by other considerations, probably contextual, like having appeared in front of “….pple”.
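
The contrast between the two models can be sketched in a few lines of code. This is a rough illustration under stated assumptions, not a reconstruction of any actual recognition system: the form labels, the function names and the single "A"-related behavior are hypothetical stand-ins.

```python
# Hypothetical labels standing in for different visual forms of capital "A".
FORMS = ["serif_A", "sans_A", "italic_A", "handwritten_A"]

A_BEHAVIOR = "treat as the first letter of the alphabet"  # assumed behavior


# Model 1: many into one. Every form is reduced to a single output variable.
def recognize_as_universal(form: str) -> str:
    if form not in FORMS:
        raise ValueError(f"unknown form: {form}")
    return "A"  # the one and only "A" known to the rest of the system


BEHAVIOR_BY_UNIVERSAL = {"A": A_BEHAVIOR}


# Model 2: instantiation. Each form keeps its own distinct final-stage output.
def recognize_as_exemplar(form: str) -> str:
    if form not in FORMS:
        raise ValueError(f"unknown form: {form}")
    return f"exemplar:{form}"  # a different output for every form


# The outputs differ, but each is wired to the same behavior, so they are
# functionally equivalent without any shared "A" variable in between.
BEHAVIOR_BY_EXEMPLAR = {f"exemplar:{form}": A_BEHAVIOR for form in FORMS}


def behave_universal(form: str) -> str:
    return BEHAVIOR_BY_UNIVERSAL[recognize_as_universal(form)]


def behave_exemplar(form: str) -> str:
    return BEHAVIOR_BY_EXEMPLAR[recognize_as_exemplar(form)]
```

From the outside the two models behave identically for every form. They differ only in whether a single intermediate value exists that all the forms pass through.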


          Behavior based on instantiation rather than universals probably cannot evolve or be learned in the conventional sense of human learning. It has to be installed rather than acquired. The problem is that the system would have no mechanism for distinguishing the first instances, or even a basis for including them. So the nondiscriminating system would tend to load up on junk information and acquire patterns of dysfunctional behavior, and unless memory and processing power were essentially infinite the whole system would become dysfunctional. Biological systems, of course, are drive determined, at least at the most basic operational levels, so perhaps the instantiation system could employ something like the drives of living systems, though it is hard to imagine what these would be like. How does the instantiating system terminate the analytical work? How does it know when the analysis is good enough?


          Our ultimate concerns here are with two questions: Are all epistemic systems information based? And if so, do they require something like universals? Are these requirements for knowledge, that is, for being able to interact successfully with the world, or are they only requirements for certain types of hardware? Imagine a system that learns by periodically sampling the environment and recording the sample or its information content in memory. Suppose this system worked according to an instantiation type of recognition process. In other words, its basic operating mode was not predication, where it is decided that “this is of that type,” but rather something like deciding that “this is sufficiently like one other.” That is, it follows a “best fit” or likeness theory of knowledge. Could such a system really work?


          We must at all times remember that we are ultimately concerned with system output or behavior. To say that the system is successful or “works” is to evaluate the output according to certain criteria, and these criteria are not necessarily those of traditional epistemology. They may be criteria set or determined by the system itself. Suppose the system is to determine whether or not it is looking at an edible apple. Now biological systems do this by sampling or testing in certain ways. They look at the apple and determine its color: red is good, green is bad and inedible unless you are very hungry. Or they could take a bite and see what happens. In either case biological systems are probably making an information-based decision by identifying redness or sweetness and employing generalized forms of these criteria acquired from previous experience, or so the theory goes.


How would our instantiating system work? Suppose it stored 100 visual impressions of apples in its memory, ranging from dark green through yellow to red apple shapes. It could then compare the visual image of the current apple with the visual images stored in memory to determine whether the current apple was acceptable. But what could “acceptable” mean here? Suppose we are talking about an apple-picking robot. We tell the robot that memory images 1 through 25 represent good apples; anything else is to be left on the tree. We tag each such image in memory with the “acceptable” criterion, and we don’t attempt to install a redness universal to be used in decision making. The robot then compares the current visual impression with the images stored in memory and decides, based on similarity to one or more of the stored images, whether or not it is looking at a good apple. The process either produces the “acceptable” output tag or it does not.
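
The apple-picking robot described above can be sketched as a best-fit comparison against tagged exemplars. Everything specific here is an assumption made for illustration: each stored "visual impression" is reduced to a mean RGB color, image 1 is taken to be the reddest and image 100 the darkest green, and the similarity cutoff `max_distance` is an arbitrary value. No redness criterion appears anywhere in the code; the decision comes only from the tag on the nearest stored image.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical stand-in for a visual impression: a mean (R, G, B) color.
Color = Tuple[float, float, float]


@dataclass(frozen=True)
class Exemplar:
    index: int          # position in memory, 1..n
    impression: Color   # the stored particular, not a generalization
    acceptable: bool    # tag installed from outside, not learned


def _lerp(a: Color, b: Color, t: float) -> Color:
    """Linear blend between two colors, t in [0, 1]."""
    return (a[0] + (b[0] - a[0]) * t,
            a[1] + (b[1] - a[1]) * t,
            a[2] + (b[2] - a[2]) * t)


def build_memory(n: int = 100, n_good: int = 25) -> List[Exemplar]:
    """Install n impressions running red -> yellow -> dark green (assumed order)."""
    red, yellow, dark_green = (200.0, 30.0, 30.0), (220.0, 200.0, 40.0), (20.0, 90.0, 20.0)
    memory = []
    for i in range(1, n + 1):
        t = (i - 1) / (n - 1)
        color = _lerp(red, yellow, 2 * t) if t < 0.5 else _lerp(yellow, dark_green, 2 * t - 1)
        memory.append(Exemplar(index=i, impression=color, acceptable=(i <= n_good)))
    return memory


def distance(a: Color, b: Color) -> float:
    """Euclidean distance between two impressions."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def judge(current: Color, memory: List[Exemplar],
          max_distance: float = 40.0) -> Optional[str]:
    """Return the "acceptable" tag, or None if the process does not produce it."""
    best = min(memory, key=lambda e: distance(current, e.impression))
    if distance(current, best.impression) > max_distance:
        return None  # not sufficiently like any stored image
    return "acceptable" if best.acceptable else None


if __name__ == "__main__":
    memory = build_memory()
    print(judge((195.0, 35.0, 30.0), memory))  # close to a stored red image
    print(judge((25.0, 95.0, 20.0), memory))   # close to a stored green image
```

The cutoff is one crude answer to the question of when the comparison is good enough: the robot stops at the single nearest image and accepts it only if it is close. A different design could poll several near neighbors instead of one.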


This could be an analogue process, like comparing two photographic negatives with an analogue spectrometer. But suppose it is an information-based process. Suppose the robot uses digital information to decide the issue; suppose its brain is a digital computer. Does this imply that the robot is using universals? Not unless we are willing to say that each apple exemplar stored in memory is itself now a universal or somehow represents a universal. But they are not; they are particulars, unless information production and processing always results in universals. Is the stored information from each apple instance now a universal? Does the robot simply have more red universals than do biological systems that do the same selection job? This question is probably resolvable via a reductio argument. Suppose the robot has a million or a billion apple images to work with. Does it then have a million or a billion universals? The many are supposed to become one, not one of a billion. We really don’t know if there are any universals at all; it depends on how the system works. Without knowing this, talk about universals is just empty noise.



RCE - 7/12