Data Flow Systems: Algorithms and Complexity
INTAS 00-397 “Data Mining Technologies and Image Processing: Theory and Application”
INTAS 00-626 “Data mining algorithm incubator”
J. Castellanos and F. Mingo, Technical University of Madrid,
M. Mulvenna, University of Ulster, K. Vanhoof, Limburg University,
M. Hatzopoulos, University of Athens,
L. Aslanyan, Institute for Informatics and Automation Problems of Armenian National Academy of Sciences,
V. Ryazanov, Computer Center of Russian Academy of Sciences,
S. Ablameyko and A. Tuzikov, United Institute for Information Technologies of Belarusian Academy of Sciences,
G. Katona, A. Renyi Institute for Mathematics of Hungarian Academy of Sciences,
H.D. Gronau, Rostock University,
J. Vorachek, Lappeenranta University of Technology,
A. Ionita, Research Institute for Artificial Intelligence, Romanian Academy of Sciences,
Yu. Zhuravlev, Moscow State University,
H. Sahakyan, Information Society Technologies Center Armenia,
V. Gladun, V.M.Glushkov Institute of Cybernetics of Ukrainian Academy of Sciences
Data Flow Systems: Algorithms and Complexity (DFS-AC) names the current research objective of a cluster of international RTD projects. It brings together research directions in algorithms and complexity with machine intelligence and distributed systems, towards integrated research into the principles of novel information systems: the so-called framework of Hereditary Hybrid Societies (HHS). Algorithmic and complexity concepts are investigated with regard to the novel issues raised by extremely large and growing data volumes and data flows in networks. Mathematically, the main emphasis is on a novel combination of application-oriented research in three important areas: massive data sets, autonomous agents living and communicating in a society, and complex systems. The technological component lies in knowledge acquisition and accumulation through distributed and cooperating systems, and in the study of properties that approximate intelligence in nature and in human society.
An open online consultation forum has recently been started to allow the submission and improvement of ideas for new, promising IST-related research areas that might become future FET proactive initiatives funded by the EU through future IST calls for proposals.
The two “new initiative” submissions presented below were prepared by the INTAS 00-397 and INTAS 00-626 teams. The INTAS-FET Strategic Workshop “Data Flow Systems: Algorithms and Complexity” aims at a wider discussion of these issues in the framework of the International Conference "i.TECH 2004".
Title of new initiative 62: Introspective Reasoning
Research Objectives and Challenges:
Predicting the future is a difficult task. But that is the challenge that has been taken up by areas of artificial intelligence and in particular data mining. At this time, in 2004, we have had relative success in well-defined areas; for example, online recommender systems and discrete scientific domains. But the predictive systems developed today are brittle in that they cannot deal with major changes in data sources or trends across time and other dimensions.
Set against this landscape, our world is becoming increasingly a digital one. The amount of data collected is increasing as sensor technology evolves, and as more people use the Internet and its attendant services.
Consequently we need to address the challenge of building flexible reasoning systems. Such systems must be open source, non-brittle reasoning systems that adapt to new environments and their data, that detect changes and trends, and that can measure their own performance.
This requires research that addresses several interconnected areas:
1) Concept drift. Detecting drift in data sources.
2) Ensemble reasoning/recommendation. Bringing together different predictive models based on distributed, disparate data sources or knowledge bases, using different mediation schemes between the recommendation and predictive agents.
3) Introspection. Critical appraisal of the predictions/recommendations to improve knowledge structures.
4) Agent-based computing and web services. Required to facilitate distributed, heterogeneous systems.
5) Semantic-based technologies. Ontologies.
6) Representation. Using and extending PMML (www.pmml.org), RDF, OWL, DAML+OIL.
7) Extending Open Standards such as data mining APIs like JSR-73 (http)
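Items 1) to 3) above can be illustrated with a minimal sketch. It is not a method proposed by the initiative: the class name IntrospectiveEnsemble, the window and tolerance parameters, and the drift rule (recent accuracy falling below an initial baseline by more than a tolerance) are all assumptions chosen for illustration. The sketch keeps a record of each model's own hits and misses (introspection), routes each prediction to the model that has recently performed best (ensemble mediation), and reports models whose accuracy has dropped (concept drift).

```python
from collections import deque


class IntrospectiveEnsemble:
    """Hypothetical sketch: an ensemble that measures its own performance."""

    def __init__(self, models, window=5, tolerance=0.3):
        # models: mapping of name -> callable taking an input, returning a label
        self.models = dict(models)
        self.window = window          # assumed size of the accuracy windows
        self.tolerance = tolerance    # assumed allowed drop before flagging drift
        # Baseline: hits/misses on the first `window` labelled examples.
        self.baseline = {name: [] for name in self.models}
        # Recent: hits/misses on the latest `window` labelled examples.
        self.recent = {name: deque(maxlen=window) for name in self.models}

    def feedback(self, x, truth):
        """Record, for every model, whether it would have been right on x."""
        for name, model in self.models.items():
            hit = 1 if model(x) == truth else 0
            if len(self.baseline[name]) < self.window:
                self.baseline[name].append(hit)
            self.recent[name].append(hit)

    def recent_accuracy(self, name):
        hits = self.recent[name]
        return sum(hits) / len(hits) if hits else 0.0

    def baseline_accuracy(self, name):
        hits = self.baseline[name]
        # Undefined until the baseline window has been filled.
        return sum(hits) / len(hits) if len(hits) == self.window else None

    def predict(self, x):
        """Mediation scheme: use the model with the best recent accuracy."""
        best = max(self.models, key=self.recent_accuracy)
        return best, self.models[best](x)

    def drifted_models(self):
        """Models whose recent accuracy fell below baseline by > tolerance."""
        flagged = []
        for name in self.models:
            base = self.baseline_accuracy(name)
            if base is not None and self.recent_accuracy(name) < base - self.tolerance:
                flagged.append(name)
        return flagged
```

In use, feedback() would be called whenever a true outcome becomes known, and predict() and drifted_models() would be consulted in between. A real system would need a statistically grounded drift test and a richer mediation scheme than picking a single best model.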
The constituent research areas of Introspective Reasoning cover a broad spectrum of computer science, but critically they combine several strands of research that are novel and high risk. Importantly, the introspective capability provides a means of objectively measuring the performance of implemented systems. For example, for online recommenders the success of personalisation can be measured and can serve as an indicator of user acceptance of online services.
The European Commission funds many RTD projects that have a component which is data mining, recommendation, personalisation, etc. Much of this research funding is not being used effectively. What is required is a research drive towards standards-based, open source, extensible predictive software that offers measurable performance and non-brittle adaptability.
Past projects have also tried to build a theory of which predictive techniques are more relevant under which circumstances. It is now accepted that such automated techniques for choosing the most appropriate predictive model are not achievable, and that more context-sensitive approaches are required that learn dynamically within an ensemble the circumstances in which one model is more appropriate than others. It is timely that many of the strands of research required for introspective reasoning are maturing (e.g., agent technology and ensemble reasoning) or offer significant promise (concept drift, ensemble recommendation). There is synergistic potential across the research centres in Europe to push forward in Introspective Reasoning, especially as much of the global expertise and knowledge base for this research is based in our continent. It is not possible for single research organisations in Europe to address such a challenge on their own, which is why support and stake-holding from the European Commission is required.
The focus of IST in FP6 is on “the future generation of technologies, in which computers and networks will be integrated into the everyday environment, rendering accessible a multitude of services and applications through easy-to-use human interfaces. This vision of ambient intelligence places the user, the individual, at the centre of future developments for an inclusive knowledge-based society for all.”
We are in danger of becoming a data-rich, knowledge-poor society. If the trends continue, then key objectives of the Information Society will be jeopardised. Social exclusion from e-services will be the norm, as the percentage of digitally disadvantaged people increases with Europe’s rising median age. What is required to power the visions of ‘ambient intelligence’ and ‘easy to use human interfaces’ is standards-based, open source, extensible predictive software that offers measurable performance and non-brittle adaptability.
The potential impact on Europe at societal, economic and business level is significant. An increasing amount of our disposable income is now being used for online services, while essential public services (from government to health to education etc) are moving towards new modes of support and delivery. This will bring about a change in business models, with attendant societal impact. Providing Introspective Reasoning adds value to the social provision model, and minimises the potential for an increase in the digital divide.
Giving Europe a lead in this area will increase the number of spin-offs and joint ventures, generating fast-moving businesses that help support the coming transition in e-services in Europe.
Communities addressed, other related initiatives
AI, Cognitive science, behavioural psychology, agent based research, data mining and predictive modelling, semantic web, Grid.
Mr Maurice Mulvenna from University of Ulster
Title of new initiative 69: Hereditary Hybrid Societies
Research Objectives and Challenges
The HHS research area was originally aimed at studying the highest intellectual properties of an advanced information ambience by analogy with the behavior of nature and societies. The main idea is to develop and implement appropriate “modeling” and “algorithmic” solutions for globally distributed autonomous activities, for huge data volumes and data flows that grow dynamically in size, and for knowledge communication and accumulation. The challenge is to achieve IT Inheritance (the accumulated transfer of knowledge, properties, etc.), originally a sophisticated property of nature.
Today’s paradigm is that the Internet is breaking traditional ways of thinking. It deals with heterogeneous dynamic resources through the interaction of mobile and intelligent agents (both machine and human), creating a hybrid society with more complex and sophisticated properties that approach inheritance. Broad multidisciplinary research into behavioral knowledge models and novel mathematics is necessary to create hybrid societies, which form the internals of advancing ambient intelligence (AmI).
Future IT systems are becoming more and more science-embedded and user-centered. Apart from specific business and scientific computations, systems have to provide more and more intelligence. An analysis of requirements revealed a lack of basic research knowledge in the information regulation principles of individual and social behavior, in data flow algorithms, and in hereditary IT design and solutions. The point is to develop and acquire technologies that, being in line with the ambitious perspectives set out for the 6th FP, will also become a key issue for future IT businesses. Experimentation with the technologies under consideration is possible on the basis of knowledge available today: algorithms, adaptive and developing systems. This will produce approximate solutions. Making such systems effective requires research of a specific type, which in future will form the background for the targeted next-generation solutions: creative systems living and interacting both with other similar environments and with humans in an information ambience. Being sophisticated, nonstandard and presented within a theoretical framework, this R&D is within the scope of FET activities.
Industry: the move from local to distributed heterogeneous management is based on data flows, knowledge acquisition and knowledge accumulation. Such human-like information systems will support businesses in providing services effectively and with AmI properties.
Economy or society: distributed IT services and integrative indicators are the base components. For dynamic and flow data of huge size, systems will work effectively when several new paradigms are addressed. For complex systems, the regular approaches will face shortages of resources: time, memory, algorithms and communication.
Communities addressed, other related initiatives
Research communities such as philosophers, mathematicians, IT specialists, psychologists, theoretical biologists, behavioral scientists, etc. may participate in HHS research. Initiatives in the described area exist at both European and US levels, although they emphasize elements different from those in the HHS problem area. HHS is not only about classification (pattern recognition), machine learning (artificial intelligence), and complexity (discrete mathematics); it addresses possibilities and approaches to systems that act on the informational level like human beings and that communicate and develop like nature and society. Building on traditional intelligence studies, it searches for constructions able to solve integrated intelligent information management issues and even go beyond them, taking into account the exceptional power of computing machinery.
Prof. Levon Aslanyan from Information Society Technologies Center, Armenia
 Research partially supported by INTAS.
 NATO HTECH-LG-961170 "Distributed Information Systems for Critical Social Issues"; INTAS 96-952 "Concurrent heuristics in data analysis and prediction"; INTAS 00-397 "Data Mining Technologies and Image Processing: Theory and Application", and INTAS 00-626 "Data Mining Algorithm Incubator".
 Summary by John Casti and Ralph Dum, Brainstorming Meeting on Complex Systems, EC IST FET Unit and NOE EXYSTENCE, April 25-26, 2002, Brussels.
 Terascale and Petascale Computing: Digital Reality in the New Millennium, The Federal High-Performance Computing and Communications (HPCC) program (2003), The National Science Foundation, Arlington, Virginia 22230, USA.