Meeting on April 20, 2009
Agenda
- Attendance
- Announce agenda
- Approve April 6, 2009 Meeting Notes. Ask for review and address any problems seen in the notes.
- Status of 4.3.0
- Discuss definition of external file formats and data type representations.
- Open Questions
- Summary and closing
Notes
Attendance
- tclarke
- kstreith
- rforehan
- raevans
- dsulgrov
- mconsidi
- dadkins
- goffena
Summary
The April 6, 2009 meeting notes were approved.
The 4.3.0 major features and release timeline were discussed. See the log for details.
Opticks external file formats and protocol representations were discussed. Data types will have OIDs indicating their structure. If the structure changes, a new OID will be assigned. A couple of issues remain open.
- How should the OID tree be represented? Some are in favor of a flat tree and others a shallow tree. Most of those giving feedback liked the idea of a data type OID being a subtree with leaf OIDs representing the "version" of the data type. There was discussion about using OIDs for other defined items in Opticks such as extension IDs and message log step IDs.
- How should the data be serialized? Should it keep the current hand-coded XML? Should ASN.1 be used as well as, or instead of, hand-coded XSD/XML? Many people seemed interested in exploring ASN.1 with BER and XER encodings instead of hand-coded XML. There was concern about the performance of an ASN.1 parser, so this should be taken into account.
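The shallow tree with version leaves described above can be sketched as follows. This is only an illustration under stated assumptions: the root arc 2.999 (the arc reserved for examples) stands in for the registered Opticks root, and the node names, arc numbers, and the OidRegistry class are hypothetical, not anything decided in the meeting.

```python
# Sketch only: the root arc, node numbers, and names are illustrative
# assumptions, not the OIDs registered for Opticks.
EXAMPLE_ROOT = (2, 999)  # the arc reserved for examples, standing in for the Opticks root


def format_oid(arcs):
    """Render a tuple of arcs as a dotted OID string."""
    return ".".join(str(arc) for arc in arcs)


class OidRegistry:
    """Maps OIDs (tuples of ints) under one root to human-readable names."""

    def __init__(self, root):
        self.root = tuple(root)
        self.names = {self.root: "root"}

    def register(self, parent, arc, name):
        """Register a named child node under an existing parent node."""
        if parent not in self.names:
            raise KeyError("unknown parent " + format_oid(parent))
        oid = parent + (arc,)
        if oid in self.names:
            raise ValueError(format_oid(oid) + " is already assigned")
        self.names[oid] = name
        return oid

    def add_version(self, type_oid):
        """Assign the next unused leaf under a data type's subtree."""
        used = [oid[-1] for oid in self.names
                if len(oid) == len(type_oid) + 1 and oid[:-1] == type_oid]
        leaf = max(used, default=0) + 1
        return self.register(type_oid, leaf, self.names[type_oid] + " v" + str(leaf))


registry = OidRegistry(EXAMPLE_ROOT)
data_types = registry.register(registry.root, 1, "DataTypes")
core = registry.register(data_types, 1, "Core")
color_type = registry.register(core, 1, "ColorType")
color_v1 = registry.add_version(color_type)
color_v2 = registry.add_version(color_type)  # structure changed, so a new OID
print(format_oid(color_v1), format_oid(color_v2))
```

In this layout an existing OID is never reassigned: a structural change to a data type only adds a new leaf under that type's subtree, and sibling subtrees remain free for other uses such as extensions or message log steps.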
tclarke proposed keeping the current serialization for 4.3.0 and publishing the current serialization forms. ASN.1 would be explored as time permits for possible inclusion in 4.3.1. In that case, the old XML would be maintained as is, provided it remains compatible with the internal data model (i.e. if a change would require incrementing the XSD version, the old XML would instead be deprecated and abandoned).
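To give a feel for how compact BER is, here is a minimal sketch of the standard BER encoding of an OBJECT IDENTIFIER value. The function names are made up for this example, it handles only short-form lengths, and it is not the output of any ASN.1 compiler mentioned in the meeting.

```python
def _base128(value):
    """Encode a non-negative integer in base 128, most significant group first.

    Every octet except the last has its high bit set as a continuation flag.
    """
    octets = [value & 0x7F]
    value >>= 7
    while value:
        octets.append((value & 0x7F) | 0x80)
        value >>= 7
    return bytes(reversed(octets))


def encode_oid_ber(dotted):
    """Return the BER encoding (tag, length, contents) of a dotted OID string."""
    arcs = [int(part) for part in dotted.split(".")]
    if len(arcs) < 2 or any(arc < 0 for arc in arcs):
        raise ValueError("an OID needs at least two non-negative arcs")
    first, second = arcs[0], arcs[1]
    if first > 2 or (first < 2 and second > 39):
        raise ValueError("invalid first or second arc")
    # The first two arcs are packed into a single subidentifier.
    contents = _base128(40 * first + second)
    for arc in arcs[2:]:
        contents += _base128(arc)
    if len(contents) > 127:
        raise ValueError("long-form lengths are omitted from this sketch")
    return bytes([0x06, len(contents)]) + contents  # 0x06 = OBJECT IDENTIFIER tag


print(encode_oid_ber("1.2.840.113549").hex())  # prints 06062a864886f70d
```

The eight octets printed here carry the same value that takes fourteen characters as dotted text, before any XML element wrapping is added.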
tjohnson has written a prototype layer swiper implementation. More information will be posted to the mailing list at a later time. This is the swiper mentioned in the Google Summer of Code ideas in the porthole and swiper section.
Decisions
Logs
2009-04-20T11:02:22 <tclarke> ok, let's begin the Opticks meeting, we'll start with attendance
2009-04-20T11:02:22 <tclarke> here
2009-04-20T11:02:24 <kstreith> here
2009-04-20T11:02:25 <rforehan> Here
2009-04-20T11:02:25 <raevans> here
2009-04-20T11:02:26 <dsulgrov> here
2009-04-20T11:02:27 <mconsidi> here
2009-04-20T11:02:30 <|dadkins|> here
2009-04-20T11:02:34 <tclarke> today's agenda
2009-04-20T11:02:34 <tclarke> 1. Attendance
2009-04-20T11:02:34 <tclarke> 2. Announce agenda
2009-04-20T11:02:34 <tclarke> 3. Approve April 6, 2009 Meeting Notes. Ask for review and address any problems seen in the notes.
2009-04-20T11:02:34 <tclarke> 4. Status of 4.3.0
2009-04-20T11:02:34 <tclarke> 5. Discuss definition of external file formats and data type representations.
2009-04-20T11:02:35 <tclarke> 6. Open Questions
2009-04-20T11:02:36 <tclarke> 7. Summary and closing
2009-04-20T11:02:40 <tclarke> any problems with last meeting's notes?
2009-04-20T11:04:11 <tclarke> we'll hold open approval until the end of the meeting, if no problems are brought up, I'll consider the April 6 notes approved
2009-04-20T11:04:14 <tclarke> next item, status of 4.3.0
2009-04-20T11:04:16 <goffena> here
2009-04-20T11:04:22 <tclarke> this week is the last week to start new development for 4.3.0 so we have a good idea what will be included
2009-04-20T11:04:28 <tclarke> here's the roadmap page
2009-04-20T11:04:28 <tclarke> https://issues.ballforge.net/jira/browse/OPTICKS/fixforversion/10406
2009-04-20T11:04:41 <tclarke> of note: the large NITF support issue will be included
2009-04-20T11:04:46 <tclarke> dadkins, care to briefly summarize what this will add/fix?
2009-04-20T11:04:57 <|dadkins|> sure
2009-04-20T11:04:58 *** tjohnson has joined #opticks
2009-04-20T11:05:05 <tclarke> tjohnson, we are just summarizing major features which will be in 4.3.0
2009-04-20T11:06:36 <|dadkins|> one second...perusing old email
2009-04-20T11:06:41 <tclarke> while we wait, there will be 64-bit support for HDF4
2009-04-20T11:06:45 <tclarke> wizard builder improvements
2009-04-20T11:06:46 <tclarke> multi-file import
2009-04-20T11:06:49 <tclarke> upgrade to qt 4.5.0
2009-04-20T11:06:54 <|dadkins|> All:
2009-04-20T11:06:54 <|dadkins|>
2009-04-20T11:06:54 <|dadkins|> Added import (On Disk or On Disk Read Only) of large NITF files which exceed the available memory in the system. Previously, the system would crash. The new version is more forgiving in low memory situations and (at worst) will give a reasonable error message if the file cannot be imported.
2009-04-20T11:06:54 <|dadkins|>
2009-04-20T11:06:54 <|dadkins|>
2009-04-20T11:06:55 <|dadkins|>
2009-04-20T11:06:55 <|dadkins|> Win64, Solaris:
2009-04-20T11:06:56 <|dadkins|>
2009-04-20T11:06:57 <|dadkins|> Added import of data sets larger than 4 GB (up to 10 GB, the maximum NITF image size).
2009-04-20T11:06:58 <|dadkins|>
2009-04-20T11:06:59 <|dadkins|> Added export of data sets larger than 2 GB (up to 10 GB, the maximum NITF image size).
2009-04-20T11:07:05 <tclarke> a new IDL scripting plug-in will be available
2009-04-20T11:07:08 <tclarke> animation bumpers
2009-04-20T11:07:13 <tclarke> some changes which will allow Opticks to run on Linux
2009-04-20T11:07:21 <tclarke> there are many other smaller bug fixes and changes but I think this covers the big ones
2009-04-20T11:08:41 <tclarke> most have been brought up on dev@opticks previously so there should not be any surprises
2009-04-20T11:08:49 <tclarke> any further questions or input on new features for 4.3.0?
2009-04-20T11:09:04 <tclarke> ok, let's move on to the next topic
2009-04-20T11:09:12 <tclarke> Opticks has a variety of data file formats and network protocols
2009-04-20T11:09:22 <tclarke> some of these are fairly well defined via an XSD and documentation (in the case of ICE)
2009-04-20T11:09:24 <tclarke> but some aspects are not well defined
2009-04-20T11:09:29 <tclarke> specifically, serializations of certain data types such as enums
2009-04-20T11:10:56 <tclarke> mostly unstructured data types (structured have a well defined XML serialization, unstructured are generally just text)
2009-04-20T11:11:13 <tclarke> I sent some emails out about this and we will be defining data types as OIDs which will never change once defined...if a type needs to be changed, a new OID is generated to supersede the previous one
2009-04-20T11:11:18 <tclarke> there are a couple of questions left which need answering
2009-04-20T11:11:22 <tclarke> first, I've registered a root OID for Opticks use
2009-04-20T11:11:29 <tclarke> an OID is a dotted series of numbers (01.03.12.05, etc.)
2009-04-20T11:11:34 <tclarke> which define a tree of registered OIDs
2009-04-20T11:11:41 <tclarke> we have a subtree but need to define how that tree will look
2009-04-20T11:11:50 <tclarke> a possibility is to make it flat and assign a single number (leaf of Opticks) for each data type
2009-04-20T11:13:12 <tclarke> preferably, we'd define a more complex tree structure
2009-04-20T11:13:15 <tclarke> thoughts on how that should look?
2009-04-20T11:13:24 <tclarke> perhaps a node for "Data Types" and sub nodes for "Simple" and "Structured"
2009-04-20T11:13:36 <tclarke> each data type could be a tree itself with sub-values defining the current "Version" of the data type
2009-04-20T11:13:42 <tclarke> ex: the ColorType data type could be 01.01, 01.02 for the second revision, etc.
2009-04-20T11:13:54 <kstreith> i'd lean towards flat
2009-04-20T11:14:01 <kstreith> and i like your suggestion of using part of the OID to represent version
2009-04-20T11:15:20 <tclarke> I think flat is a bad idea...it makes it hard for us to assign OIDs for other uses at a later time
2009-04-20T11:15:27 <kstreith> aren't they just number sequences?
2009-04-20T11:15:28 <tclarke> having sub-trees also allows us to have a tree for "Core" data types and assign trees for opticks extensions
2009-04-20T11:15:44 <tclarke> yes, but listing a bunch on a page with no structure can make it more difficult to track down the one you want
2009-04-20T11:15:46 <tclarke> it's like c++ namespaces
2009-04-20T11:15:57 <tclarke> you can define all your classes in the global namespace, but using namespaces helps prevent clashes and makes it easier to organize
2009-04-20T11:16:11 <tclarke> I don't think we need a really deep structure, perhaps a DataTypes sub-tree with another for core/extensions
2009-04-20T11:17:54 <tclarke> another use might be message log steps...we currently use a freeform string which is often a UUID but that could be a registered OID which would define what precisely that step does
2009-04-20T11:18:09 <tclarke> we can have a sub-tree for extensions which holds the extension UUIDs for known Opticks extensions
2009-04-20T11:18:12 <tclarke> making it easier to track down extension dependencies
2009-04-20T11:18:31 <tclarke> other thoughts?
2009-04-20T11:20:07 <tclarke> ok, the other question has to do with datatype serializations
2009-04-20T11:20:17 <tclarke> right now, most of our externally serializable data types have XML representations defined in an XSD
2009-04-20T11:20:23 <tclarke> first, I'd like to maintain these as-is for backward compatibility reasons
2009-04-20T11:20:39 <tclarke> now, should we continue to use these hand coded serializations exclusively? if not, should they still be used at all? the option I'm
2009-04-20T11:20:46 <tclarke> leaning towards is to define them in ASN.1 which is an abstract notation which is more readable than XSD
2009-04-20T11:22:13 <tclarke> and can be used to serialize to BER which is a compressible binary serialization
2009-04-20T11:22:16 <tclarke> and XER which is an XML serialization
2009-04-20T11:22:22 <tclarke> the main difference between our hand coded serializations and XER is use of XML attributes
2009-04-20T11:22:32 <tclarke> XER encodes everything as elements instead of attributes and elements
2009-04-20T11:22:41 <tclarke> this would change our XML serializations (making them version "4") but maintain the same general hierarchical structure
2009-04-20T11:22:48 <tclarke> there are compilers which convert ASN.1 to DTD and XSD for use in xml parsers
2009-04-20T11:22:55 <tclarke> and they can also generate C code to parse/generate XER and BER from the ASN.1
2009-04-20T11:23:04 <tclarke> this means we could compile the ASN.1 to C code and ship it as a library which can be used by other apps
2009-04-20T11:24:20 <tclarke> or they can parse the XER using any XML parser
2009-04-20T11:24:20 <tclarke> thoughts?
2009-04-20T11:24:45 <kstreith> interesting....
2009-04-20T11:24:49 <kstreith> but maybe too aggressive if you're targeting for 4.3.0
2009-04-20T11:24:52 <tclarke> for 4.3.0 here's what I'd suggest
2009-04-20T11:26:34 <tclarke> defining the current non-hierarchal types as they currently stand (basically, publish TypeConveverter.cpp and StringUtilities.cpp) and assign OIDs to all the types
2009-04-20T11:26:43 <tclarke> write a page talking about ASN.1 and its benefits
2009-04-20T11:26:55 <tclarke> that's it for 4.3.0....in the interim, begin converting the current data types to ASN.1
2009-04-20T11:27:10 <tclarke> when that's done, offer the ASN.1 based serializations as the default....XER for normal exporters with a BER option
2009-04-20T11:27:15 <tclarke> and BER for "embedded" exporters such as the conversions used in ICE
2009-04-20T11:27:21 <rforehan> Does that mean eventually getting rid of the xsd's?
2009-04-20T11:27:22 <tclarke> continue to load the old XML as long as it is feasible
2009-04-20T11:27:29 <tclarke> rforehan: yes, sort of....we can still generate them but they would no longer be hand coded
2009-04-20T11:28:57 <tclarke> feasible would mean as long as the internal class structure is compatible with the old XML....if there are major changes to a data type, the old XML would no longer be loaded
2009-04-20T11:29:35 <tclarke> ok, if there's no additional discussion, let's move on to open questions
2009-04-20T11:29:42 <rforehan> So in essence, we would never have to worry about misspellings in attribute names since the from and to code would be computer generated from the ASN.1 definitions.
2009-04-20T11:31:09 <tjohnson> 4.3.0 schedule? Specifically, lockdown and release dates?
2009-04-20T11:31:10 <tclarke> as long as the ASN.1 was written correctly, yes
2009-04-20T11:31:18 <tclarke> 4.3.0 soft lockdown is this friday...meaning, no new features are started
2009-04-20T11:31:23 <tclarke> a week after that is hard lockdown
2009-04-20T11:31:35 <tclarke> the rollup will begin the following monday but will be an extended rollup to allow a couple of external developers to build against 4.3.0
2009-04-20T11:31:58 <tclarke> rc1 should be available on the web site May 19
2009-04-20T11:33:24 <tclarke> final delivery (after testing and 4.3.0 rollup) is June 11 I believe...I'm not 100% on that date tho
2009-04-20T11:33:43 <tclarke> rforehan: basically, the asn to C compiler creates C structs and support functions to save and load....we'd need to convert those C structs into our internal classes
2009-04-20T11:33:46 <tclarke> and vice versa
2009-04-20T11:33:49 <tclarke> the C functions
2009-04-20T11:33:52 <tclarke> perform validation while loading
2009-04-20T11:34:01 <tclarke> here's an open source compiler: http://lionet.info/asn1c/
2009-04-20T11:35:33 <tclarke> here's a commercial compiler which generates XSD (the above generates DTD) http://www.obj-sys.com/asn1-compiler.shtml
2009-04-20T11:35:41 <tclarke> tjohnson: questions on the release schedule?
2009-04-20T11:35:56 <tjohnson> No. I just wanted to make sure I was up to date.
2009-04-20T11:36:06 <tclarke> also, tjohnson has prototyped a layer swiper as discussed here: https://wiki.ballforge.net/confluence/display/opticksDev/Summer+of+Code+2009
2009-04-20T11:36:21 <tclarke> would you like to say anything about it? If not, I believe kstreith will post to the mailing list when he's had time to look it over
2009-04-20T11:38:15 <tclarke> if there are no more discussions, we can end the meeting, I'll give everyone a couple more minutes
2009-04-20T11:38:26 <goffena> "basically, the asn to C compiler creates C structs and support functions to save and load"
2009-04-20T11:39:55 <goffena> Is there any additional non-negligible performance overhead to doing this?
2009-04-20T11:40:01 <tclarke> versus what?
2009-04-20T11:40:09 <goffena> What is happening currently.
2009-04-20T11:40:20 <tclarke> probably will perform slightly better but it is difficult to say
2009-04-20T11:40:24 <tclarke> Xerces loads the XML and the XSD
2009-04-20T11:40:26 <tclarke> parses the XML to a DOM
2009-04-20T11:40:29 <tclarke> then validates the DOM against the XSD
2009-04-20T11:40:32 <tclarke> we then parse the DOM to C++ classes
2009-04-20T11:40:41 <tclarke> the ASN.1 version will contain the validation as static code structures
2009-04-20T11:40:43 <tclarke> and will load the XML into custom C structs
2009-04-20T11:40:47 <tclarke> we'd then convert the C structs to C++ classes
2009-04-20T11:42:16 <tclarke> since we don't know exactly what's happening in the two parsers, it's difficult to make assertions
2009-04-20T11:42:20 <goffena> "will load the XML into custom C structs" i guess is where i don't know what will happen performance wise
2009-04-20T11:42:24 <tclarke> but generally specialized parsers are faster and have less mem overhead than generic parsers
2009-04-20T11:42:26 <tclarke> but not always
2009-04-20T11:42:35 <goffena> I know when we save session upstairs we have datasets with many bands.
2009-04-20T11:42:36 <goffena> 1000s
2009-04-20T11:42:41 <tclarke> but Xerces is already doing something similar...it loads XML into a DOM which is represented as Xerces classes
2009-04-20T11:42:58 <tclarke> if session save uses the ASN.1 code, it could save to BER which is binary and generally loads faster than XML which is quite bloated
2009-04-20T11:44:35 <tclarke> in summary, I can't make any promises either way without actually implementing the ASN.1 solution and comparing it...but based on what I know, I expect it will be faster with ASN.1
2009-04-20T11:44:39 <tclarke> educated guess only at this point
2009-04-20T11:44:46 <goffena> Ok. I'll keep a look out for serialize of dataset bands when we see an update.
2009-04-20T11:44:48 <tclarke> how about this for a slightly modified schedule
2009-04-20T11:44:50 <goffena> understand
2009-04-20T11:44:55 <tclarke> using what's there now with documentation for 4.3.0 as I mentioned before
2009-04-20T11:45:08 <tclarke> starting to implement the ASN.1 and using it for session save/load initially...this can be easily tested and tweaked
2009-04-20T11:45:14 <tclarke> when something acceptable has been established, using it as the external formats
2009-04-20T11:46:43 <tclarke> I don't know an exact timeline for this but I'd shoot for 4.3.1 supporting the new formats
2009-04-20T11:46:59 <tclarke> we can also talk offline where I can more easily show you some examples
2009-04-20T11:47:33 <tclarke> ok, if we're done with discussions, let's summarize and close
2009-04-20T11:49:02 <tclarke> as usual, I'll post notes to the wiki and send an email with the link...next meeting is in two weeks...feel free to add to the agenda if you have something you'd like to discuss
2009-04-20T11:49:11 <tclarke> I'll send an email re: the discussion we just had to verify the timeline and immediate implementation
2009-04-20T11:49:15 <tclarke> thanks for coming