News 2009
April, 2009, Augustus version 0.3.1 was released.
Augustus is an open source platform for estimating and deploying
multiple predictive models that was developed by Open Data.
Version 0.3.1 was just released on Source Forge.
January, 2009, MalStone.
Open Data Group worked together with other
members of the
Open Cloud Consortium to help develop a cloud computing
benchmark for data intensive computing called MalStone.
Recent talks on cloud computing.
Robert Grossman has given several recent talks on cloud computing, including:
- An Overview of the Open Cloud Consortium,
Cloud Computing Interoperability Workshop,
in conjunction with the Object Management Group (OMG) Technical Meeting,
Crystal City, Virginia, March 16, 2009.
- Extending Analytics to Clouds,
8th Annual ON*VECTOR International Photonics Workshop, La Jolla,
California, February 24, 2009.
- Cloud Computing: The Importance of the Data Center for Science,
AAAS Annual Meeting, Chicago, Illinois, February 16, 2009.
News 2008
Sector/Sphere and the Open
Cloud Testbed win the SC 08 Bandwidth Challenge.
The NCDM and Open Cloud
Consortium entry, which highlighted several applications developed with
the Sector storage cloud and Sphere compute cloud and running on the
Open Cloud Testbed, won the SC 08 Bandwidth Challenge. Open Data
Group is a member of the Open Cloud Consortium and developed the
benchmarks and performed the interoperability studies that were part of
the winning entry. SC 08 took place in Austin
during November 15-21, 2008.
Augustus version 0.3.0 was released.
Augustus is an open source platform for estimating and deploying
multiple predictive models that was developed by Open Data.
Version 0.3.0 was just released on Source Forge.
Robert Grossman gave several talks on cloud computing:
- A talk at Cloud Computing and its Applications (CCA 08) in
Chicago on Oct 23, 2008 that looked at cloud computing from
a viewpoint in which the data center is the unit of computation.
- A Plenary Talk on cloud computing at
the UK e-Science All Hands Meeting 2008 in
Edinburgh, UK on September 9, 2008.
- A talk on August 26, 2008 at KDD 08 in Las Vegas comparing Sector/Sphere
to Hadoop. In the comparison presented, Sector was about twice as fast as Hadoop.
- A Keynote Talk on cloud computing at the 2008 IEEE Congress on Services
(Services 2008) on July 10, 2008 in Hawaii.
Open Cloud Consortium. Open Data Group is participating
in the newly formed Open Cloud Consortium (OCC). The purpose of the
OCC is to support the advancement of research in cloud computing, to
develop open standards in cloud computing, to develop open source
software for cloud computing, to manage testbeds for cloud computing,
to run meetings, workshops and other events related to cloud
computing, and in general to advance the state of the art in cloud
computing. Robert Grossman is the initial Chair of the OCC.
Augustus Version 0.2.6.6 was released on Source
Forge. Augustus is an open source infrastructure for building and
deploying data mining and statistical models for large data sets and
high volume data streams. Augustus is compliant with the Predictive
Model Markup Language.
News 2007
Award. Robert Grossman led the Angle Project,
which won First Place in the 2007 Analytics Challenge at the ACM/IEEE
International Conference for High Performance Computing and
Communications 2007 (SC07).
The title of the project was "Angle: Detecting Anomalies and Emergent
Behavior from Distributed Data in Near Real Time."
SIGKDD Award: Robert Grossman was awarded the ACM Special Interest Group
on Knowledge Discovery and Data Mining (SIGKDD)
Service Award for his "role in the development of open and
scalable architectures and standards for the SIGKDD and Global KDD
Communities."
Best Paper Award: The paper "Data Quality Models for High
Volume Transaction Streams: A Case Study" by Joseph Bugajski, Robert
Grossman, Chris Curry, David Locke and Steve Vejcik won the second
annual Data Mining Practice Prize at KDD 2007. The prize is awarded
each year "for work that has had a significant and quantitative impact
in the application in which it was applied."
Augustus Version 0.2.6.5 was released on Source Forge.
Augustus is an open source infrastructure for building and deploying
data mining and statistical models for large data sets and high volume
data streams. Augustus is compliant with the Predictive Model Markup
Language.
DM-SSP 07 Workshop. Robert Grossman organized the Workshop
on Data Mining Standards, Services and Platforms (DM-SSP 07), at
KDD-2007 in San Jose on August 12, 2007. The workshop highlighted
recent progress on developing standards-based services for data mining
and data intensive computing. This year's focus was on cloud
computing.
PMML Version 3.2. The Predictive Model Markup Language
(PMML), Version 3.2 was released in May, 2007. Open Data participated
in the development of this standard.
New Methodology Introduced. A very practical mechanism for improving
predictive analytics as the amount of data increases is to build an
analytic infrastructure that automatically builds many predictive models,
instead of the more traditional approach that builds one (or a few)
manually. Robert Grossman recently gave a lecture on
this: Modeling Highly Large, Heterogeneous Data Sets: Towards
a Billion Models, DIMACS Workshop on Recent Advances in Mathematics
and Information Sciences for Analysis and Understanding of Massive and
Diverse Sources of Data, Rutgers University, New Brunswick, May 15,
2007. How this idea was applied to analyze transactional data from
Visa is described in two papers at
KDD 2007: Robert Grossman, Joseph
Bugajski, Chris Curry, David Locke, and Steve Vejcik, Detecting
Changes in Large Data Sets of Payment Card Data: A Case Study, and
Joseph Bugajski, Chris Curry, Robert Grossman, David Locke, Steve
Vejcik, Data Quality Models for High Volume Transaction Streams at the
KDD Data Mining Case Studies Workshop.
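
The idea of building many models automatically, rather than one model by hand, can be illustrated with a minimal sketch. This is not the method from the papers above and not Augustus code: it assumes each model is only a per-segment baseline (mean and standard deviation), and the segment names, values, and function names are hypothetical, invented for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


def build_segment_models(records):
    """Fit one baseline (mean, standard deviation) per segment key."""
    values_by_segment = defaultdict(list)
    for segment, value in records:
        values_by_segment[segment].append(value)
    return {
        segment: (mean(values), pstdev(values))
        for segment, values in values_by_segment.items()
    }


def is_anomalous(models, segment, value, threshold=3.0):
    """Flag a value that deviates from its own segment's baseline."""
    if segment not in models:
        return False  # no model has been built for this segment yet
    center, spread = models[segment]
    if spread == 0:
        return value != center
    return abs(value - center) / spread > threshold


# Hypothetical (segment, value) observations, invented for illustration.
history = [
    ("merchant_a", 10.0), ("merchant_a", 12.0), ("merchant_a", 11.0),
    ("merchant_b", 500.0), ("merchant_b", 520.0), ("merchant_b", 480.0),
]
models = build_segment_models(history)
```

With one baseline per segment, a value of 500.0 is flagged for merchant_a but not for merchant_b, a distinction that a single model fitted to all of the data would be likely to blur.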
News 2006
DM-SSP 06 Workshop. Robert Grossman organized the Workshop on
Data Mining Standards, Services and Platforms (DM-SSP 06), at KDD-2006
in Philadelphia on August 20, 2006. The workshop highlighted recent
progress on developing standards-based services for data mining and
data intensive computing.
Augustus Version 0.2.4 was released on Source Forge.
Augustus is an open source infrastructure for building and deploying
data mining and statistical models for large data sets and high volume
data streams.
News 2005
Augustus Version 0.2.1 was released on Source Forge. Augustus is
an open source infrastructure for building and deploying data mining
and statistical models for large data sets and high volume data
streams. Augustus is compliant with the Predictive Model Markup
Language (PMML). Augustus supports vectorized operations and is
designed for data sets that are too large for existing open source
data mining systems.
This can be downloaded from Source Forge:
www.sourceforge.net/projects/augustus.
PMML Version 3.1. The Predictive Model Markup Language
(PMML), Version 3.1 was released in December, 2005. Open Data
participated in the development of this standard.
Industry related. Robert Grossman was elected to the six-member
executive board for the ACM Special Interest Group on Knowledge
Discovery in Data (ACM SIGKDD) for the term 2005-2009.
Industry related. Robert Grossman was the general chair of
KDD-2005, The Eleventh ACM SIGKDD
International Conference on Knowledge Discovery and Data Mining that
took place on August 21-24, 2005 in Chicago.