The Dark Side of Google: Chapter 7. Technocracy

NB this book and translation are published under Creative Commons license 2.0 (Attribution, Non Commercial, Share Alike).
Commercial distribution requires the authorisation of the copyright
holders: Ippolita Collective and Feltrinelli Editore, Milano (.it)

Translation: Patrice Riemens

Analysis of the Google phenomenon reveals a colorful landscape, in which the economy of search is but one element within a far larger and more complex mosaic. Eric Schmidt himself states that Mountain View is setting up the foundations for a global information technology enterprise, a “One Hundred Billion Dollars Business”, obviously something that is more than a mere search engine firm.

What it actually is, is an invasive knowledge management system, whose most significant developments and methods we have sketched {in the previous chapters}: strategies that pair aggressive marketing with smart image building, propagation of highly configurable {and personalisable}, and yet always recognizable interfaces, creation of standard contents ‘outsourced’ to users and developers, adoption of development methods straight out of the co-operative Free and Open Software handbook, use of state-of-the-art data capture and archival systems, information retrieval systems associated with {advanced} profiling techniques, both implicit and explicit, and {last but not least, sophisticated} personalisation of advertisements.

Technocracy or the experts of science

Experts have found in the control and manipulation of technology the ideal tool to maintain their power, impose their personal interests {upon society}, or acquire more privileges. The mechanism is {absurdly} simple: technology is being (re)presented not only as the guarantor of the objectivity in scientific research, it is also used to validate the decisions of politicians in power, or more generally, those of any ‘authority’ that has access to the technological oracle.

The application of scientific research in its technological form is excessive, yet reality is constantly being interpreted according to that paradigm. The curiosity and desire for knowledge that inspire scientific research are being hampered by ill-informed profitability criteria which are the hallmark of {contemporary} private and public funding. If research does not come up with immediately profitable technological artifacts, it is deemed uninteresting. The discourse of power then becomes technocratic, {completely} at the opposite end of community-oriented sharing, of self-management, of dialogue and mediation between individuals. To sell Google’s technocracy as if it were a tool for direct democracy is a charade, meant to make us believe that we participate in some sort of grand electronic democracy game, when it is completely devoid of substance. Sure, we may publish what we want on the Internet, and Google shall index us. But we, who are ‘dilettantes’ and ‘heretics’, are not allowed to mention that Google’s accumulation strategy resonates remarkably well with the market economy /system/, which is based on endless growth. This makes sense, since we are neither alumni of the London School of Economics, nor successful entrepreneurs, and we are not {certified} experts either. Hence we have no ‘authority’ whatsoever. Yet, sound common sense and Orwellian memories are more than enough to realise that such growth, without end or aim, is the manifestation of a will to technological power that only considers {human} individuals as potential consumers {and nothing else}.

That is why PageRank[TM], which, as we have seen, is not merely an algorithm, becomes the cultural prism through which Google intends us to analyse everything. In a certain sense, what we witness is an enforced extension of the peer review system – which works well enough within the academic system – to the whole gamut of human knowledge.

Traditional authorities, like religious or political institutions, have hit rock bottom as far as their credibility is concerned. Nonetheless, their loss of grip on reality, far from having favoured the blossoming of autonomous spaces, has led to an unreal situation where no assertion can be held to be true unless validated by some sort of technological authority.

The authority of machines [computers?], in most cases, is only a query return from a database being dished out by the high priests of technology and {assorted} experts to the wealthy class of ‘prosumers’. An extreme form of relativism is the hallmark of methods which claim to extract ‘the truth’ out of the available and {allegedly} boundlessly numerous data, as one can surmise from the number of algorithms and filters that have been used [?]. The true meaning of ‘to every search an appropriate answer’ is {actually}: ‘a personalised product to every consumer’.

Confronted with this closure of creation, and of the management and application of knowledge at our expense, there appear to remain only two options: refusal of scientific culture, considered as the root of all evil; or on the contrary, blind and enthusiastic acceptance of every ‘innovation’ brought forth by technology. However, between {these} two extremes, techno-hate and techno-craze, it should be possible to advance the curiosity which is associated with the hacker ethic, viz. the sharing of knowledge, the critical attitude towards ‘truths’, the rigorous verification of sources, all while pursuing the path of open knowledge {and free circulation of information}.

Education is then a fundamental issue in this context, but the ways to disseminate scientific knowledge on a large scale are {simply} not there. Educational structures in Europe as well as in North America are only geared towards the production of specialists. As of today, no pedagogic model exists that would correspond to a demand for a ‘dilettante’ kind of scientific {approach to} knowledge, not even in countries with a non-western tradition like Brazil or India, which are nonetheless producing high level scientific research and state-of-the-art technology at low costs ‘thanks’ to unremitting {international} competition. A scientific activity that would be neither academic nor entrepreneurial, but decentralised and of a DIY kind is nowhere on the agenda, despite the fact that it is indispensable to foster basic competences and the ability to evaluate the technological innovations which concern all of us. More specifically, the {whole} notion of ‘scientific culture’ would need to be appraised afresh to cater for the all-round need to have an elementary command of what is needed to confront the technological tsunami that engulfs us.

The rise of information technology to the status of main mover of technological innovation makes new scenarios possible: IT is not merely a technique to automate the management of information, it also has a logic of its own, meaning that it constantly strives to alter its own underpinnings. IT is all at once material, theoretical and experimental.

IT applies to the formalisation of language (and hence contributes to the formalisation of knowledge), and puts that to work with the physical components of electronics, developing from there languages which in their turn influence theories of knowledge. IT functions as a loop of sorts, following a very particular cyclical process.

In classic sciences one observes stable phenomena: the science of physics, for instance, constructs natural data and creates relevant theories. But with IT, {and its derivative, computer science,} the phenomena that theory helps identify are {wholly} artificial; they continuously change, both in nature and conceptually, in the same time and measure as theoretical and experimental advances make them more refined: the software that was developed on a computer ten years ago will be structurally different from software that was developed last month. What we held to be true yesterday, we know today that we won’t hold to be true tomorrow, when we will have more powerful machines that will do novel things: this is a living world, as it were, and hence in a constant state of becoming.

(to be continued)

The Dark Side of Google: Chapter 6. Quality, Quantity, Relation. Continued

The Myth of instantaneous search

Since it is clear that Google’s data ‘capital’, gigantic as it is, will never correspond to the totality of {the information present on} the Web, presenting oneself as an ‘instantaneous’ interface, bridging the gap between the user’s search intentions and the so-called ‘exact’ result, smacks of naivety – or of deceit.

Since the Web consists of nodes (pages) and arcs (links), every time one browses it by visiting pages, one follows links that constitute a trajectory analysable through the mathematical models of graph theory.

The pre-set orientations search engines propose to us will always lead us to the ‘right’ object, regardless of the dimensions the Web might have or get {in future}. By applying efficiency and efficaciousness criteria, a search engine will chart from the query the ‘optimised’ trajectory, meaning that the number of nodes hit will be low, and {hence} the time taken by the search will look nearly instantaneous. Google actually pushes in the direction of one single trajectory, something illustrated by the “I’m feeling lucky” button on its main page.
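
To make the graph vocabulary concrete, here is a minimal Python sketch. It is an illustration only, not a description of how any real search engine works: the toy web, its page names and both function names are invented for the example. It treats pages as nodes and links as arcs, checks whether a browsing session is a valid trajectory, and uses breadth-first search to find the trajectory that hits the fewest nodes, which is one plausible reading of an ‘optimised’ trajectory.

```python
from collections import deque

# Hypothetical toy web: each page (node) maps to the pages it links to (arcs).
WEB = {
    "home": ["news", "blog"],
    "news": ["blog", "archive"],
    "blog": ["archive", "essay"],
    "archive": ["essay"],
    "essay": [],
}

def is_trajectory(graph, path):
    """Return True if every consecutive pair of pages in `path` is joined by a link."""
    return all(b in graph.get(a, []) for a, b in zip(path, path[1:]))

def shortest_trajectory(graph, start, goal):
    """Breadth-first search: the path from `start` to `goal` visiting the fewest nodes."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no chain of links leads from start to goal

# The 'optimised' trajectory hits three nodes ...
print(shortest_trajectory(WEB, "home", "essay"))  # ['home', 'blog', 'essay']
# ... while a wandering reader may follow an equally valid, longer one.
print(is_trajectory(WEB, ["home", "news", "blog", "archive", "essay"]))  # True
```

Both paths end on the same page; only the second passes through the intermediate pages that the shortest route skips.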

This ‘optimisation’ squeezes search into a three-pronged sequential scheme: user-algorithm-goal. In the long term, this dynamic leads to ‘digital passivity’, a stage where we simply wait till results are brought to us, for us to choose among them.

Moreover, this efficiency/efficaciousness is paradoxically grounded not on an increase in the size of the data pool where searches are conducted, but on its opposite, on a limitation of the access to the information ‘capital’, since no trajectory proposed by the search engine will ever take place in real time [French 'the moment t'] on the network, but will be calculated first according to what has actually been archived, and according to the user personalisation obtained through filters and cookies.
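
The point about answers being computed from an archive rather than from the live network can be sketched in a few lines of Python. This is a deliberately naive illustration: the page ids, the texts and the function names are all hypothetical, and personalisation through filters and cookies is not modelled. An inverted index is built once from a snapshot, and every later query is answered from that index alone.

```python
from collections import defaultdict

def build_index(snapshot):
    """Map each lowercase word to the set of archived page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in snapshot.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return index

def search(index, query):
    """Return the sorted ids of archived pages that contain every query word."""
    words = query.lower().split()
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)

# Hypothetical snapshot taken at archiving time.
ARCHIVE = {"p1": "quill pen and ink", "p2": "bird feather"}
INDEX = build_index(ARCHIVE)

# The live web has since gained a page the archive knows nothing about.
LIVE_WEB = dict(ARCHIVE, p3="feather pen")

print(search(INDEX, "feather"))      # ['p2']
print(search(INDEX, "feather pen"))  # [] although p3 on the live web matches
```

The lookup is fast precisely because it never touches the network: whatever appeared after the snapshot simply does not exist for the query.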

The access to the information offered by Google is fast, very fast, and even looks immediate, to the point of suggesting the annihilation of time, and of implying the existence of an immensity of data that have been perused for the purpose. The mediation of technology (through interfaces, algorithms, pre-set searches, etc.) makes this temporal ‘annihilation’ possible as well as {the feeling of} practically immediate access [*N4].

The rapidity with which results are returned, however, has a detrimental [Indians would say: 'deleterious' ;-)] effect on the quality of the search. As everyone is aware who has conducted (re)search herself, the time one spends on (re)searching is a determinant element of the experience: to map out one’s own {research} path, to make choices according to the moment, all this generates a feeling of being into it and is {deeply} satisfying. Google allows us to ‘localise’ in space (that is its own multidimensional space) what we want, but, however brief the time spent waiting for the result, we always adopt a passive attitude in front of the technological oracle.

In an active (re)search drive, the aim is no longer ‘access’ to the data, but to accomplish a rich and variegated journey, and to use the (re)search endeavour for mapping out complex trajectories. Efficiency as a concept vanishes. The larger the number of visited nodes and the greater the complexity of the interlinkages we conceive, the more numerous the occasions will be to trigger significant choices, and to refine our (re)search. This approach allows for a cognitive enrichment going well beyond the immediate performance. For instance, when we visit links offered to us by a site we are visiting, and then continue our navigation on sites that have been marked as congenial, we create {every time} a unique trajectory; maybe we’ll even resort to bookmarking them. Such a procedure is {starkly} at variance with a coherent user-algorithm-result sequence, but it does create a rich path full of sidelines, of branches, of {cognitive} jumps and winding detours, all catering to a non-linear cognitive desire [*N5].

To conclude, search engines are perfect tools for fulfilling the quantitative aspects of a (re)search taking place within an already fully structured resource pool, such as lexicons, encyclopedias, etc. {Here,} the quantity is directly in proportion to the accumulation and computing potential: Google’s reach obviously dwarfs that of all its competitors, but in order to retain its position, Google needs to constantly expand in terms of algorithms, machines, users, etc.

Conversely, quality need not necessarily reside with technological prowess or economic might. Nobody {in her right mind} believes that the results returned correspond to the full spectrum of available information: the emergence of the best possible path cannot be foreseen, cannot be computed, but can only be arrived at step by step.

Under the veil of the myth

The positioning values of Google’s ranking do not correspond to any clear evaluation criterion: yet, in the majority of cases results returned are [look?] exhaustive, that is, we can in no way tell whether something has escaped the spider, unless we are experts in the issue at stake and know of a resource that has not been indexed {by Google}. The capillary distribution of its search tools has made Google a ‘de facto’ standard. The white space (‘blank box’) where we type the keywords of our (re)search functions for the user as a ‘Weltanschauung’ of sorts, promoting a very particular world-view, that of the idea of ‘total service’: the search engine will answer any question, and will satisfy all requests made in the realm of the Internet.

Epistemologically speaking, the ‘blank box’ represents a cognitive model of the organisation of knowledge: we request {through} the white space an answer to all the search intentions we have put forward, regardless of whether we wanted documents, or further information, or data, or simply wanted to ‘navigate’. The (re)search activity becomes completely merged with the entity that provides the service, Google, [of which we have an invading perception (?)].

The habit of using this tool becomes ingrained behaviour, a repetitive activity: it becomes very difficult for users to imagine a different way to satisfy their need for ‘input’. They have become tied to the reassuring efficiency/efficaciousness of the ‘blank box’.

To be active on the Web, and hence to need access interfaces and tools for unearthing information and setting out paths, is a profoundly contextual and diversified occupation. (Re)search is anything but homogeneous and cannot be reduced to the use of the ‘blank box’. What we request and what we desire does not solely stem from a desire that can be expressed in the analytical terms of quantitative information, but is something that also hinges upon the way we approach (re)search, the context in which we undertake that (re)search, our own cultural background and {last but not least} on our aptitude to confront novelty, explore new territories, and face diversity {in general}. It is impossible to satisfy the quest for information through a one-size-fits-all solution.

Since the indexation of {web}pages is {by definition only} incomplete, in the sense that it is a selection obtained through {the} ranking {system}, what Google does offer us is the prosaic possibility to ‘encounter’ ‘something’ we might find interesting and/or useful in the overflowing amount of data in its collection of subjects [issues]. A (re)search intention, however, implies a desire to find, or even to discover, “everything that one doesn’t know but that it is possible to learn about”.

The {‘good’} giant then appears for what he is: enormous, extended, branching out, but not necessarily adapted to our (re)search purposes.

(Re)search Models

The ambiguity entertained by search engines, wanting us to ‘search in {their} infinite environment’ rather than in a closed, localised world that conforms to our (re)search intentions, comes from the formal superimposition of two {distinct} levels, that of the interface [*N6] and that of the organisation. The interface, in this particular context, is the technological element through which one accesses the information and the search gets executed; the organisation, on the other hand, is the architecture, the technological model through which information is archived and disposed. The two levels {obviously} influence each other: organisation-related choices prescribe the use of specific interfaces, while the information that is visualised through these interfaces betrays in its form[at?] the way it is archived. [?]

The problem with this superimposition is that such information is presented in the form of identifiable and unambiguous, single data. The user of Google moves in a linear fashion through the results list of the ranking; in order to move from one result to the next she needs to go back to the start list, with no cross-over linkages possible at the level of the interface [?]

With search engines, one retrieves information, but without any consideration being given to the paths that have been followed {to obtain it}. The interface which directs our interactions is the ‘blank box’ where our queries are inserted: at this first level of access, all information is on the same plane [has the same rank(ing) ?]. It is homogeneous, yet at the same time separate and fragmented in order to allow the listing of the results as they have been arranged in order of pertinence by the algorithm.

However, as far as the (re)searches one does on a daily basis are concerned, the same results can be linked together in all sorts of ways, and it is not necessary to arrive at the same ordered arrangement every time, nor is there only a single ‘correct’ result; on the contrary, a (re)search which is not about data structured as in an encyclopedia or a dictionary or any other object of that kind (and that may also change {in nature} over time), could well remain without an immediate answer, and would instead require an effort of creativity, of ‘mixage’, and of recombination.

When a formal identity is being imposed between the level of the interface and that of the organisation, the outcome is {by necessity} a constraining model. In Google’s case, {as} we are dealing with what is perceived as an infinite power of search, the means to arrive at a result are being substituted for the (re)search activity itself.

Let’s take an example: … [the example taken is a French word, 'plume', whose English equivalent ('feather') would not yield the same illustrative power. Briefly, the authors argue that if you 'Google' for that word, the first returns (out of 6.700.000 !) will be about everything (various IT companies, a circus, etc.) but birds' feathers or ink pens (also 'plume' in French) - I'll need to sort out a nice equivalent with the collective (or 'invent' one myself) - but maybe _you_ have an idea?] … A more extended perspective of what it means to ‘discover’ information, one that would take the cognitive potential underlying every information resource pool into account in a critical manner, would tend to see the access-search function as a process of exploration and creation rather than as one of localisation. The emphasis would then shift from epistemology towards ontology: it is no longer sufficient to know the information; one must become aware of our true role as creators of information [*N7]. Search engines that operate at the access level are therefore of no use for exploration, as they merely intervene on the first {and basic} level of the presentation of information.

Browsing is the moment of true dynamism in the linking together of digital objects, which are then able to express to the highest degree their heuristic and communicative potential. This is something that is learnt through experience, and it mutates as we are learning it, during the very activity of exploring.

There is a major difference between searching and finding. Google makes us ‘find’ things, causing the satisfaction that goes with the feeling of accumulation. But far more interesting than ‘finding’ is the search itself. And maybe it would be even more rewarding to find, but not completely, because that would mean that we are still engaged in the act of (re)searching.

A search engine is an instrumental model that arranges information into a certain order. It would be more useful and also more commendable to imagine models that (re)combine information, and {so} generate knowledge.

END of Chapter 6

The Dark Side of Google: Chapter 6. Quality, Quantity, Relation

The Rise of Information

The information society is heterogeneous in the extreme: it uses network communication systems like telephony, digitalised versions of broadcast [*N1], pre-Web traditional media, like dailies, radio or television, and Internet-born ones like e-mail or P2P exchange platforms, all this with gay abandon, and even without an afterthought. But a closer look reveals that all these systems are based on one single resource: information. Now within the specific domain of search engines, and thus of information retrieval, one can state that what constitutes information is represented by the sum total of all extant web pages [*N2].

The quantitative and qualitative growth of all these pages and of their content has been inordinate and continues to be so. That comes from the fact that it has become so {unbelievably} easy today to put up content on the Web. But contents are not isolated islands, they take shape within a multiplicity of relationships and links that bind together web pages, websites, issues, documents, and finally the contents themselves.

Direct and unmediated access to this mass of information is well-nigh impossible, even as a simple play of thought: it would be like ‘browsing through the Web manually’. This is the reason why there are search engines, to filter the Web’s complexity and to serve as interface between the information and ourselves, by giving us search results we are happy with.

In the preceding chapters, we have reviewed the principal working tools of a search engine, that is the instruments Google, and other search {companies}, have put in place to scan through web pages, to analyse and order them with the help of ranking algorithms, to archive them on appropriate hardware supports, and finally to return a result to the users according to their search queries.

The quantity of web pages stored in memory is thus crucial to estimate the technical and economic potency of a search engine. The larger its ‘capital’ of checkable web pages, the higher a search engine will score on reliability and completeness of its returns, {but} this obviously within the limits of the specified context.

Yet, however enormous a search engine’s ‘pages capital’ may be, it will, and could, never be entirely complete and exhaustive, and no amount of time, money or technology invested in it could change that. It is absurd to think that it would be possible to know, or, at a more down-to-earth level, simply to copy and catalogue the whole Internet. It would be like the pretense to know the totality of the living world, including its constant mutations.

The information storage devices used by search engines like Google are like vessels: let’s imagine we’d have to fill an enormous vessel with diminutive droplets (think of all the pages that constitute the Web’s information). Assuming that our vessel is able to contain them all, then our task would be to capture and identify them all, one by one, in a systematic and repetitive manner.

But if, on the other hand, we’d think there are more droplets than our vessel can contain, or that we cannot fathom an algorithm to capture them all, or that the capture may be possible but will be slow, or even that the whole task may be hopelessly … endless, then we would need to switch our tactics. Especially as our data-droplets change with time, pages get modified, and resources are jumping from one address to another…

At this stage, we might decide to go only for the larger droplets, or to concentrate our efforts on those places where most droplets fall, or we could choose to collect only those droplets that interest us most, and then try to link them together in the way we think is the most relevant.
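
Two of these tactics can be sketched in Python under loudly stated assumptions: the link table, the page texts and both function names are invented for illustration, and ‘where most droplets fall’ is read here, as one possible interpretation, as ‘pages with the most inbound links’. The vessel’s limited size is modelled by a `capacity` parameter.

```python
from collections import Counter

def select_pages(links, capacity):
    """Pick at most `capacity` pages, preferring those with the most inbound links."""
    inbound = Counter(target for targets in links.values() for target in targets)
    pages = set(links) | set(inbound)
    # Sort by descending inbound count, breaking ties alphabetically.
    ranked = sorted(pages, key=lambda p: (-inbound[p], p))
    return ranked[:capacity]

def select_by_interest(texts, keywords, capacity):
    """Pick at most `capacity` pages whose text contains any keyword of interest."""
    wanted = {k.lower() for k in keywords}
    matches = [page for page in sorted(texts)
               if wanted & set(texts[page].lower().split())]
    return matches[:capacity]

# Hypothetical data: who links to whom, and what each page says.
LINKS = {"a": ["b", "c"], "b": ["c"], "c": [], "d": ["c", "b"]}
TEXTS = {"a": "bird feather", "b": "ink pen", "c": "feather pen", "d": "circus"}

print(select_pages(LINKS, 2))                       # ['c', 'b']
print(select_by_interest(TEXTS, {"feather"}, 5))    # ['a', 'c']
```

Neither selection is ‘the Web’: each is a choice about which droplets are worth keeping, and a different choice yields a different vessel.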

As search engine {companies} continue to go after the {holy grail of} cataloguing ‘everything’ {on the Net}, it might be better to take a more localised approach to the Web, or to accept that for any given ‘search intention’, there may well be many answers possible, and that among all these answers some may be ‘better’, because they conform to specific demands regarding [either?] speed [or?] and completeness. One should always keep in mind that the quality of results is dependent upon our subjective perception when it comes to being satisfied with a search return. And in order to accept or to reject a search return, it is essential to apply our critical faculties and to be conscious of the subjectivity of one’s viewpoint. In order to establish the trajectory one is really interested in, it is necessary to assume the existence of a closed and delimited network, a kind of world that is bounded only by our own personal requirements, yet always realising that this concerns a subjective localisation, which is neither absolute nor constant in time. [I am not completely happy with this, but then the French text... etc]

From an analytical point of view, charting a network means being able to partition the network for examination into sub-networks, which amounts to creating little localised and temporary worlds (LCWs, Localised Closed Worlds), each containing at least one answer to the search that has been launched. Without that, many searches would go on with no end in sight, especially since the {amount of} data to be analysed goes well beyond the ability of a human person to capture it all: hence this would be a non-starter. Conversely, altering and specifying the query, and refining one’s vantage point, will generate a trajectory that is more concordant with the departure point [of the search?]. By looking at the Web as a closed and localised world we also accept that the very dynamic of birth, growth and networked distribution of information ({even} happening while this information may already have become invalid) is an ‘emergence’ phenomenon, which is neither fortuitous, nor with[out?] a cause.
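
One way to picture such a localised closed world is as the sub-network reachable from a chosen starting page within a few links. The Python sketch below is only an illustration under that assumption: the graph, the page names, the function name and the hop limit are all hypothetical, and the text itself does not prescribe any particular partitioning method.

```python
def localised_world(graph, start, max_hops):
    """Return the sub-network of pages reachable from `start` within `max_hops` links.

    Links pointing outside the selected world are dropped, so the result is closed.
    """
    frontier = {start}
    world = {start}
    for _ in range(max_hops):
        # Pages one link further out that are not yet part of the world.
        frontier = {n for page in frontier
                    for n in graph.get(page, []) if n not in world}
        world |= frontier
    return {page: [n for n in graph.get(page, []) if n in world]
            for page in sorted(world)}

# Hypothetical toy web.
WEB = {
    "home": ["news", "blog"],
    "news": ["archive"],
    "blog": ["essay"],
    "archive": [],
    "essay": ["far"],
    "far": [],
}

print(localised_world(WEB, "home", 1))
# {'blog': [], 'home': ['news', 'blog'], 'news': []}
```

Changing the starting page or the hop limit yields a different world, which matches the idea that the localisation is subjective and temporary rather than absolute.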

Emergence [*N3] is an occurrence which can be described in mathematical terms as an unexpected and unforeseeable outburst of complexity. But it is foremost an event that generates situations which cannot be exhaustively described. To analyse and navigate an ‘emerging universe’ like the Web demands a permanent repositioning of oneself. This not only determines a ‘closed and localised world’ of abilities and expectations, but also the opening up towards new avenues of exploration (other worlds are always possible, outside one’s own closed one), and thus the appreciation that results can only and always be {fragmented and} incomplete.

Quantity and quality

Indexation by way of page accumulation is a quantitative phenomenon, but does not in itself determine the quality of information on the Web; there the prime objective is to capture all pages, not to make a selection. The relationships between the pages give rise to emergence because they are generated on the basis of a simple criterion, the links existing between them. The quality of information hence springs forth from their typology, and is determined by their ability to trace trajectories, without bothering about a need to capture ‘all’ information available [?]. Quality therefore depends mostly on making a vantage point explicit through a particular search trajectory: basically, it is the surfers, the pirates, the users of the Web who determine, {but also} increase the quality of information by establishing links between pages. The power of accumulation of Google’s algorithms is useful to achieve this, but is insufficient in itself.

The evaluation of the pages’ content has been outsourced to algorithms, or rather to the companies controlling them. The {whole} Google phenomenon is caused by our habit of trusting an entity with apparently unlimited power that is able to offer us the opportunity to find ‘something’ interesting and useful within its own resource ‘capital’, which itself is being peddled as ‘the whole Web’. However, the limits of this {allegedly} miraculous offer are concealed: no word about what was not in that ‘capital’, or only in part, and especially not about what has been excised from it.

The thorny ethical and political problem attendant upon the management and control of information still refuses to go away: who is there to guarantee the trustworthiness of an enterprise whose prime motive is profit, however ‘good’ it may be?

Even though considerable economic resources and an outstanding technological infrastructure are put to the task of constantly improving the storage and retrieval of data, the political question posed by the accumulation of data {by one single actor} cannot and should not be sidestepped. Google represents an unheard-of concentration of private data, a source of immense power, which is yet devoid of any transparency. It is obvious that no privacy law can {address and} remedy this situation, and that it would be even less the case through the creation of ad hoc national or international bodies /towards the control of personal and sensitive data/. The answer /to the issue of confidentiality of data/ can only reside with a larger awareness and assumption of responsibility by the individuals who create the Web {as it is}, and this through {a process of} self-information. Even if this is no easy road, it is the only one likely to be worth pursuing in the end.

(to be continued)
