OK, I'll kick this topic off on our new forum. (thanks Chris)
Here are a couple of things to consider when handling data for data sonification software (DSS):
- When is a flat file structure not enough?
- Having read the data (.CSV files?) into memory, it then has to be parsed into:
  - Lists
  - Dictionaries
  - SQL DBs
  - Other external DB structures (b-trees ...
- Metatagging - when to, when not to, and which schemes?
- There might be an argument for developing a common XML
- Interprocess and/or trans-network data flow logistics incl. between different sw apps.
- Access to statistical processing - how does each of us do it? External tools, integrated?
- Develop a set of tests/benchmark targets towards which we can design.
(From a suggestion by Greg and Bruce towards a conf. demo/workshop)
- What things need addressing in dataflow of complex models, such as Tom H's, to make them more widely available?
- What links can be made between TaDa-like DBs and actual data?
- Data cleaning: what tools/experiences? Can they be shared?
- An example data repository to which contributions can be made. (Extension of 2004 EEG experiments.)
- Buffering for repeated playback
- Streaming audio vs direct to HD
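To make the parsing question above a bit more concrete, here is a minimal Python sketch of the same small CSV ending up as a list of rows, a dictionary of columns, and an SQL table. The column names (time, value), the sample numbers and the function names are all made up for illustration, and an in-memory SQLite DB stands in for whatever external DB one might actually use.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for a CSV file on disk.
CSV_TEXT = "time,value\n0.0,1.5\n0.1,2.5\n0.2,0.5\n"


def load_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    reader = csv.DictReader(io.StringIO(text))
    return [{"time": float(r["time"]), "value": float(r["value"])} for r in reader]


def to_columns(rows):
    """Reshape the row dictionaries into a dictionary of column lists."""
    return {key: [row[key] for row in rows] for key in rows[0]}


def to_sqlite(rows):
    """Load the rows into an in-memory SQLite table named 'samples'."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE samples (time REAL, value REAL)")
    conn.executemany("INSERT INTO samples VALUES (:time, :value)", rows)
    conn.commit()
    return conn


rows = load_rows(CSV_TEXT)        # list of row dictionaries
columns = to_columns(rows)        # dictionary of column lists
conn = to_sqlite(rows)            # relational table
max_value = conn.execute("SELECT MAX(value) FROM samples").fetchone()[0]
print(len(rows), columns["value"], max_value)
```

Which of the three is "enough" probably depends on the queries: column lists suit a straight mapping to sound parameters, while the SQL table earns its keep once you need selection or aggregation before sonifying.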
That's a bit of a core dump on some data topics.
Perhaps others would like to add their data-handling concerns/topics.
-drw (David Worrall)
kinds of data
Hiya Ben! (For some reason I couldn't reply to your reply. Chris, is that me or a limit of the system... probably me.... :-( )
This is not a formal definition, but I think of a datum as a symbolic representation of a (discrete) instance, and a dataset as a collection of 'em. A datum can 'carry' (implied or explicit) metadata. Compare 1 and 1.0 and 1.00, or 'a' and 'A', for example. I think of each of these as a separate datum. However, 7 and 111 can be equivalent data when the representation in the first case is in base 10 and in the second in base 2. Whether '111' is different or not depends on metadata context, i.e. the symbol '111' might represent three fingers. So whether something is a datum or not depends on the intent of the observer.
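The 7 versus 111 example can be sketched in a few lines of Python. This is only an illustration of the idea, not a proposed scheme: the Datum class and the context tags ("base10", "base2", "tally") are names I made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datum:
    symbol: str   # the written representation
    context: str  # hypothetical metadata tag: "base10", "base2" or "tally"

    def value(self):
        """Interpret the symbol according to its metadata context."""
        if self.context == "base10":
            return int(self.symbol, 10)
        if self.context == "base2":
            return int(self.symbol, 2)
        if self.context == "tally":
            return len(self.symbol)  # one mark per finger
        raise ValueError("unknown context: " + self.context)


seven = Datum("7", "base10")
binary = Datum("111", "base2")
fingers = Datum("111", "tally")

print(seven.value() == binary.value())  # same quantity, different symbols
print(binary == fingers)                # same symbol, different data
print(fingers.value())
```

The symbol alone does not settle what is represented; the interpretation only becomes fixed once the context travels with it.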
How we handle data/datasets in DS will depend on what we're trying to do with them. Sometimes relational models are more appropriate than object representations - called the ORM dilemma in the DB programming world. Sometimes data are represented in 16-bit binary format and stored in a 'database' called an audio file. Strictly speaking, when we listen to the contents of an audio file we are not listening to the data but listening for information through the data. That's how I understand it, anyway. You'd be correct if you surmised that I have difficulty with the expression data mining, but perhaps that's for another day :-)
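As a rough sketch of the audio file as 'database' idea, the Python below packs a short list of numbers into 16-bit samples in a WAV container and reads them back. The one-sample-per-datum mapping, the min/max scaling, the 8000 Hz sample rate and the function names are all my assumptions for the example; it uses only the standard wave and struct modules and keeps the file in memory.

```python
import io
import struct
import wave


def data_to_wav_bytes(values, sample_rate=8000):
    """Scale values to the signed 16-bit range and store them as mono WAV."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid dividing by zero for constant data
    samples = [int(round(((v - lo) / span * 2 - 1) * 32767)) for v in values]
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes = 16 bits per sample
        w.setframerate(sample_rate)
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buf.getvalue()


def wav_bytes_to_samples(blob):
    """Read the 16-bit samples back out of the WAV bytes."""
    with wave.open(io.BytesIO(blob), "rb") as r:
        n = r.getnframes()
        return list(struct.unpack("<%dh" % n, r.readframes(n)))


values = [0.0, 5.0, 10.0]
samples = wav_bytes_to_samples(data_to_wav_bytes(values))
print(samples)
```

Note that the original values cannot be recovered from the samples without knowing the scaling that was applied, which is metadata the audio file itself does not carry.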