Scaling eCGA Model Building via Data-Intensive Computing

I just uploaded the technical report of the paper we put together for CEC 2010 on how we can scale up eCGA using a MapReduce approach. The paper, besides exploring the Hadoop implementation, it also presents some very compelling results obtained with MongoDB (a document based store able to perform parallel MapReduce tasks via sharding). […]

Related posts:

  1. Scaling Genetic Algorithms using MapReduce
  2. Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using Meandre
  3. Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using Meandre

I just uploaded the technical report of the paper we put together for CEC 2010 on how we can scale up eCGA using a MapReduce approach. The paper, besides exploring the Hadoop implementation, it also presents some very compelling results obtained with MongoDB (a document based store able to perform parallel MapReduce tasks via sharding). The paper is available as PDF and PS.

Abstract:
This paper shows how the extended compact genetic algorithm can be scaled using data-intensive computing techniques such as MapReduce. Two different frameworks (Hadoop and MongoDB) are used to deploy MapReduce implementations of the compact and extended com- pact genetic algorithms. Results show that both are good choices to deal with large-scale problems as they can scale with the number of commodity machines, as opposed to previous ef- forts with other techniques that either required specialized high-performance hardware or shared memory environments.

Related posts:

  1. Scaling Genetic Algorithms using MapReduce
  2. Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using Meandre
  3. Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using Meandre

Soaring the Clouds with Meandre

You may find the slide deck and the abstract for the presentation we delivered today at the “Data-Intensive Research: how should we improve our ability to use data” workshop in Edinburgh. Abstract This talk will focus a highly scalable data intensive infrastructure being developed at the National Center for Supercomputing Application (NCSA) at the University […]

Related posts:

  1. Meandre: Semantic-Driven Data-Intensive Flows in the Clouds
  2. Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using Meandre
  3. [BDCSG2008] Clouds and ManyCores: The Revolution (Dan Reed)

You may find the slide deck and the abstract for the presentation we delivered today at the “Data-Intensive Research: how should we improve our ability to use data” workshop in Edinburgh.

Abstract

This talk will focus a highly scalable data intensive infrastructure being developed at the National Center for Supercomputing Application (NCSA) at the University of Illinois and will introduce current research efforts to tackle the challenges presented by big-data. Research efforts include exploring potential ways of integration between cloud computing concepts—such as Hadoop or Meandre—and traditional HPC technologies and assets. These architecture models contrast significantly, but can be leveraged by building cloud conduits that connect these resources to provide even greater flexibility and scalability on demand. Orchestrating the physical computational environment requires innovative and sophisticated software infrastructure that can transparently take advantage of the functional features and to negotiate the constraints imposed by this diversity of computational resources. Research conducted during the development of the Meandre infrastructure has lead to the production of an agile conductor able to leverage the particular advantages in the physical diversity. It can also be implemented as services and/or in the context of another application benefitting from it reusability, flexibility, and high-scalability. Some example applications and an introduction to the data intensive infrastructure architecture will be presented to provide an overview of the diverse scope of Meandre usages. Finally, a case will be presented showing how software developers and system designers can easily transition to these new paradigms to address the primary data-deluge challenges and to soar to new heights with extreme application scalability using cloud computing concepts.

Related posts:

  1. Meandre: Semantic-Driven Data-Intensive Flows in the Clouds
  2. Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using Meandre
  3. [BDCSG2008] Clouds and ManyCores: The Revolution (Dan Reed)

Scaling Genetic Algorithms using MapReduce

Below you may find the abstract to and the link to the technical report of the paper entitled “Scaling Genetic Algorithms using MapReduce” that will be presented at the Ninth International Conference on Intelligent Systems Design and Applications (ISDA) 2009 by Verma, A., Llorà, X., Campbell, R.H., Goldberg, D.E. next month. Abstract:Genetic algorithms(GAs) are increasingly […]

Related posts:

  1. Scaling eCGA Model Building via Data-Intensive Computing
  2. Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using Meandre
  3. Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using Meandre

Below you may find the abstract to and the link to the technical report of the paper entitled “Scaling Genetic Algorithms using MapReduce” that will be presented at the Ninth International Conference on Intelligent Systems Design and Applications (ISDA) 2009 by Verma, A., Llorà, X., Campbell, R.H., Goldberg, D.E. next month.

Abstract:Genetic algorithms(GAs) are increasingly being applied to large scale problems. The traditional MPI-based parallel GAs do not scale very well. MapReduce is a powerful abstraction developed by Google for making scalable and fault tolerant applications. In this paper, we mould genetic algorithms into the the MapReduce model. We describe the algorithm design and implementation of GAs on Hadoop, the open source implementation of MapReduce. Our experiments demonstrate the convergence and scalability upto 105 variable problems. Adding more resources would enable us to solve even larger problems without any changes in the algorithms and implementation.

The draft of the paper can be downloaded as IlliGAL TR. No. 2009007. For more information see the IlliGAL technical reports web site.

Related posts:

  1. Scaling eCGA Model Building via Data-Intensive Computing
  2. Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using Meandre
  3. Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study using Meandre