Review page for Data Management - Import Export feature for NetBeans 7.0

This page is used to manage the review of the Import/Export feature.

The value of this approach is that it allows me to track all the issues separately and to drive them to resolution. For me, email reviews, although nice and informal, are almost impossible to track to conclusion. Inline comments in the main document muddy up the document and are also difficult to track to conclusion.

The review process works like this:

  • Add a comment header of the format "Comment <initials>-<number> (<status>)" (see below for examples)
  • A discussion then can take place under that comment header
  • Each comment will have a status at the top, with the status set to Approved, Accepted PDU (pending doc update) or Unresolved
  • If necessary, we do more rounds of discussion until the item is marked as Approved


Commenter Initials Key

DVC - David Van Couvering

AHI - Ahimanikya Satapathy

MB - Manish Bharani


DVC - Import From File Overall - Approved

Overall, this is very helpful, and looks really cool!

However, one general issue I have is that it is presented more like a demo than a full spec. There are places where fields, properties, valid values, and meanings of terms are not described completely. We really need this to be able to fully review, document and test this feature.

DVC-01 - Import file - Select source for import

It looks like you can select multiple sources for import. This is hard to get my head around. What does this look like? This isn't really discussed in the spec.

MB : Spec has been updated with the appropriate explanation.

DVC: I read the spec, and it appears that you can select multiple sources, but that each source is treated as a new table, and the wizard is run for each one. I'm not sure about this mode. I think it is more confusing than helpful, because you don't even know what it means to select multiple sources and then only one source is used in the wizard; it's only after you've finished the wizard that you iterate again for the next source.

I would like to suggest that you can only pick one source, and keep this simpler. To make it easier, the Import action should remember the location you used for your previous import so the user doesn't have to navigate back to the same place again.

This topic may be worth discussing on the mailing list, involving the other members of the team and our HIE guy, Mike Radosti.

DVC-02 - Import file - Selecting table type - Approved

Please provide definitions for all table types - their names and what they mean (e.g. what format is expected when you choose a table type)

MB : Spec has been updated with relevant details.

DVC-03 - Import file - Table type - Approved

I think this is a file type, not a table type, right?

MB : Files selected here for import will be represented as tables. Maybe we can name it more appropriately as "File Table Type". The intent here is to provide a name for the table being created from the selected file source.

DVC - If I get things correctly, what is happening is a virtual table is being built out of a file, using Axion, and then this virtual table is imported into your existing schema as a "real" table. But note that none of this matters that much to the user. From the user's perspective, I am importing a *file* into a *real table* - I shouldn't know, nor have to know, about "virtual tables." That concept shouldn't leak into the user interface - it is an implementation detail.

That said, from the latest shots, it looks like you are asking the user to specify a file type, so this looks good.

DVC-04 - Import file - Define Table MetaData - Approved

Please describe what *all* the properties are, what they mean, and legal values for the prototypes, including all the special characters such as {pipe}, {CR}, {LF}, etc.

MB : Spec has been updated with the detailed explanation.
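
As an illustration of the kind of definition asked for above, here is a minimal sketch of how symbolic delimiter tokens could be expanded into literal characters. The class name `DelimiterTokens` and the token-to-character mapping are assumptions made for this example; the authoritative list of tokens and their meanings is whatever the spec defines.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: expands symbolic delimiter tokens such as {pipe} into literal characters. */
final class DelimiterTokens {
    // Assumed token table (hypothetical); the real list belongs in the spec.
    private static final Map<String, String> TOKENS = new LinkedHashMap<>();
    static {
        TOKENS.put("{pipe}", "|");
        TOKENS.put("{CR}", "\r");
        TOKENS.put("{LF}", "\n");
    }

    /** Replaces every known token in the input; any other text is left untouched. */
    static String expand(String raw) {
        String result = raw;
        for (Map.Entry<String, String> e : TOKENS.entrySet()) {
            result = result.replace(e.getKey(), e.getValue());
        }
        return result;
    }
}
```

With this mapping, a record delimiter entered as `{CR}{LF}` would expand to the two-character Windows line ending.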

DVC-05 - Import file - Preview Table Content - Approved

Please describe the meaning of the toolbar icon ("refresh"?), and what it means to limit rows. I have an idea, but we need a complete description.

MB : Spec has been updated. Actually the screen shot was old and it has been updated now to the latest which does not contain the components questioned in this comment. Refresh is embedded into it and "limit row" option has been replaced with "Page Size" option.

DVC-06 - Import file - Import to existing table? - Approved

It looks like you only support importing to a *new* table. What if you want to import to an existing table? A common use case would be getting a new version of sample data from a customer, or importing a new version of the data from the source (e.g. an updated listing of US representatives after the election results are in).

If you say that's not currently supported, that's fine, but I think this is a pretty important use case to cover as soon as we can.

MB : Spec has been extended to include this scenario. Section 6.5 holds this update - Approved

DVC - Great. I'll add specific comments below as new items. It would be nice if this were incorporated into the overall spec

DVC06.5 - Errors

What happens if there is an error reading the file, because it is missing, has an invalid format, or whatever?

MB : As of now, the user can set the tolerance level for file parsing errors by setting "Max no of faults to tolerate" on the Define Metadata wizard panel (Section 5.x). This value defaults to 0. If the user anticipates some errors but wishes to go on with the import, this number can be set to a fairly high value. In that case, the import ignores rows with errors and imports only the rows that were parsed successfully.
Also, the user can preview the file to check visually for errors. As this may not always be possible, especially with large files, we could add a file check option/button that becomes available after the user has entered the parsing info on the metadata screen. This would probably help. I would like to get Ahi's recommendation on this.

DVC When I worked with bulk import tools in the past, which have a similar user flow, we would let users choose to either "fail on error" or "continue on error". Although setting the number of failures is interesting, I think the more common use case is to say "fail" or "don't fail". I think you need a "don't fail" option, which isn't currently there, or it's not clear.

You don't describe what is done with the errors - are they written to an output tab (my preference)? Do the errors provide specific information about the failed row and what went wrong? Please describe.

With regard to "testing" the file before import, that sounds reasonable. I would add a button called "validate input source" and then errors can be written to an output tab. Maybe discuss further on the mailing list.
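
To make the options in this thread concrete, here is a minimal sketch of a parser that honours a fault limit and records a message for every rejected row. It is not the feature's implementation. The class `TolerantImport`, the column-count check, and the convention that a negative limit means "don't fail" are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class TolerantImport {
    /** Outcome of a parse run: accepted rows plus one message per rejected row. */
    static final class Result {
        final List<String[]> rows = new ArrayList<>();
        final List<String> errors = new ArrayList<>();
        boolean aborted;
    }

    /**
     * Parses delimited lines, expecting a fixed number of columns per line.
     * maxFaults = 0 fails on the first bad row; a negative value never fails (assumed convention).
     */
    static Result parse(List<String> lines, String delimiter, int expectedColumns, int maxFaults) {
        Result result = new Result();
        for (int i = 0; i < lines.size(); i++) {
            String[] fields = lines.get(i).split(Pattern.quote(delimiter), -1);
            if (fields.length == expectedColumns) {
                result.rows.add(fields);
                continue;
            }
            // Keep enough detail for the user to find and fix the row.
            result.errors.add("line " + (i + 1) + ": expected " + expectedColumns
                    + " fields but found " + fields.length);
            if (maxFaults >= 0 && result.errors.size() > maxFaults) {
                result.aborted = true;
                break;
            }
        }
        return result;
    }
}
```

The `errors` list is what would be written to an output tab. A "validate input source" button could run the same routine with a negative limit and discard the rows.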

DVC-07 - Import file - Bulk import

Many vendors support bulk import APIs for faster processing of large data sets from files. Does your implementation make use of those APIs?

MB : Not very sure about the APIs being mentioned. Could you give me a reference that would help me understand this further?

DVC - Oh, OK. Well, this is important to understand. DB vendors provide interfaces for bulk loading (and exporting) of large data files. Generally they follow a different data insertion path, where for example the index is not maintained during the bulk insert, records are inserted in larger batches rather than a record at a time, sometimes the log is not enabled. Using these facilities can have a very big impact on performance of bulk inserts (and exports).

For example, see the MySQL LOAD DATA INFILE command, the Apache Derby bulk import/export tools, the MS SQL Server bcp command, and the Oracle import/export utility.

You may not be able to use all of these because some of them are command-line tools, but you should definitely do some research on them and decide what you want to do with this.
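
Where a vendor bulk-load facility cannot be used, plain JDBC batching is a portable fallback that at least avoids a round trip per record. The sketch below shows that fallback, not any vendor's bulk API. The class `BatchInsertSketch` and its method names are hypothetical, and the table name is assumed to come from a trusted source because it is concatenated into the SQL.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Collections;
import java.util.List;

final class BatchInsertSketch {
    /** Builds "INSERT INTO table VALUES (?, ?, ...)" with one placeholder per column. */
    static String insertSql(String table, int columnCount) {
        return "INSERT INTO " + table + " VALUES ("
                + String.join(", ", Collections.nCopies(columnCount, "?")) + ")";
    }

    /** Inserts rows in batches inside one transaction; rolls back if any batch fails. */
    static void insertAll(Connection conn, String table, List<Object[]> rows, int batchSize)
            throws SQLException {
        if (rows.isEmpty()) {
            return;
        }
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        try (PreparedStatement ps = conn.prepareStatement(insertSql(table, rows.get(0).length))) {
            int pending = 0;
            for (Object[] row : rows) {
                for (int i = 0; i < row.length; i++) {
                    ps.setObject(i + 1, row[i]);
                }
                ps.addBatch();
                if (++pending == batchSize) {
                    ps.executeBatch(); // send a full batch to the server
                    pending = 0;
                }
            }
            if (pending > 0) {
                ps.executeBatch(); // send the final partial batch
            }
            conn.commit();
        } catch (SQLException e) {
            conn.rollback(); // leave the target table as it was
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
    }
}
```

Running the whole import in one transaction also means a failure part-way through leaves the target table unchanged.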

DVC-08 - Import from table - overall - Approved

This is very cool, this is a highly rated feature request.

MB : Thanks. I'm sure this would unleash immense power to database usage from NetBeans.

DVC Well, I'm not sure about "immense power" (images of the Incredible Hulk and Superman come to mind :)) but it's a very nice feature

DVC09 - Import from table - Import from multiple tables

The interface seems to imply you can import from multiple tables. What exactly does that mean?

MB : Many tables from a variety of databases can be imported at once.

DVC - But what does that mean? Multiple data sources into a single table, or multiple data sources into multiple tables? From the import file stuff above, it looks like it's the latter. Again, I'd like to suggest that we only support one table at a time, rather than have a multiple-iterations wizard, which seems complex and confusing.

DVC10 - Import from table - Filtering - Approved

There appears to be no way to filter what you import, based on some kind of query. In general I would like to be able to import anything that can deliver a tuple set, be it a view, a table, or a query.

MB : This can be a valuable enhancement. I'll put this in the future wish list (Import/Export feature) to track and would request Ahi's comment on this.

DVC OK, what you are offering is definitely better than nothing, but we should get this in there.

DVC11 - Import from table - Views - Approved

There appears to be no way to import from a view.

MB : Currently this has not been enabled. This is being tracked as future enhancement here.

DVC OK, again, I think this is pretty important.

DVC12 - Import from table - Import into existing table - Approved

There appears to be no way to import into an existing table. This seems pretty important.

MB : Spec has been updated to include this feature.

DVC - OK, I will add comments below.

DVC13 - Import from table - Update an existing import - Approved

This may be the same as DVC12, but if you have imported into a new table, you might want to re-import from the same data source from time to time. There does not appear to be any support for this. And I don't want a virtual table, I want a real table that can get updated from time to time.

MB : Spec has been updated to include this feature.

DVC - OK, I will add comments below.

DVC14 - Import from table - Canceling

What if the import is taking too long? How do I cancel?

MB : Currently we do not have this option. I'll track it into the future requirements section.

DVC - Sorry, but it's a defect if you can't cancel. There is a standard approach for this, using the Progress APIs and making the task cancellable. This needs to be solved before we release.
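
A minimal sketch of what "cancellable" means for the import loop, using only the JDK so that it runs stand-alone. The class `CancellableImport` is hypothetical. In NetBeans, `cancel()` would typically be wired to the `Cancellable` callback supplied to the Progress API when the progress handle is created; that wiring is omitted here.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

final class CancellableImport {
    // Set from the UI thread, read from the worker thread.
    private final AtomicBoolean cancelled = new AtomicBoolean(false);

    /** Requests cancellation; returns true once the request has been recorded. */
    boolean cancel() {
        cancelled.set(true);
        return true;
    }

    /** Processes rows until finished or cancelled; returns the number of rows handled. */
    int run(List<String> rows) {
        int processed = 0;
        for (String row : rows) {
            if (cancelled.get()) {
                break; // stop between rows so no row is left half-written
            }
            // A real import would insert 'row' into the target table here.
            processed++;
        }
        return processed;
    }
}
```

The flag is checked between rows. What happens to rows already imported when the user cancels (keep or roll back) is a design decision the spec still needs to state.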

DVC15 - Import from table - Failures

What happens if there is a failure during import? How is that handled?

MB : Could you please specify which failures you expect to be handled?

DVC - I would like you to think through the flow of the tool and determine what the failure scenarios are. I'll give you some examples:

  • There is an error in the source data which causes import of a row to fail
  • The connection is lost during import
  • The user kills NetBeans or shuts down his machine in the middle of import

You get the idea. As many error cases as you can think of should be written down and then please describe how the tool handles it. The primary requirements are (a) the user is given enough information to figure out what went wrong and to fix it (b) neither the source nor target data is corrupted in some way.

DVC16 - Export to Table - Overall

Do we really need both export to table and import to table?

MB : Hum! Very valid. Export to file is a useful feature, but export to a relational table might be the reverse case of import into a table. Ahi, any comments?

DVC17 - Export to Table - Triple-doller should be triple-dollar (spelling) - Approved

MB : This has been corrected (rather replaced with more common delimiter types) for this spec.

DVC18 - Export to Table - user-defined delimiters - Approved

Please describe how user-defined delimiter works

MB : Field and Record delimiter have an item called "User Defined" each. On selecting this from the list, combo box becomes editable and allows user to specify a delimiter of choice. Spec has been updated with this explanation.

DVC19 - Export to table - specifying a file name - Approved

How do you specify a different folder when specifying a file name?

MB : This has been explained under extended section.

DVC - OK, I will add comments below. By the way, it is very strange to have all these "extended sections", it would be great if you could just incorporate them into the main spec. This will be particularly helpful for QE when they want to test this feature.

DVC20 - Export to file - Showing exported data - Approved

It's not clear when you show the exported data in NotePad - is this brought up automatically, or is it just stored in a file (I would prefer the latter)

MB : This is just stored in a file. The data is shown to be exported by opening it up in a notepad for capturing it into the spec.

DVC - OK, you might want to clarify this in the spec.

DVC21 - Export to file - Export to RSS - Approved

What exactly does it mean to "export to RSS?" - please describe in detail what happens here and what the configurable options are

MB : RSS is not being considered as an option for file import or export here. Fixed Width, TXT, Spreadsheet, CSV and XML are the only supported file options here.

DVC OK, the original spec showed RSS

DVC22 - Export to file - Export to HTML - Approved

What exactly does it mean to "export to HTML" - please describe in detail what happens here and what the configurable options are

MB : HTML is not being considered as an option for file import or export here. Fixed Width, TXT, Spreadsheet, CSV and XML are the only supported file options here.

DVC OK, the original spec showed HTML

DVC23 - Export - Failures

What happens under failure conditions (file not found, error writing to file, error reading data from source table, etc.)?

MB : Some of the errors (file not found) will be caught at the wizard level itself. Others can be reported while the Import/Export is in progress or at completion. Ahi, please comment.

DVC Each potential error condition, and how it is handled, needs to be explicitly described in the spec. This is very important, as it is the source of many bugs for both QE and our users. It's important to think this through thoroughly during design.

DVC23 - Export - Cancelling

How do you cancel a long-running export? What happens if you do cancel?

MB : Would need Ahi to comment.

DVC - This needs to be handled before this feature can be considered complete.

DVC24 - Export to file - Data type formatting - Approved

How do you configure the formatting for certain data types, particularly date and time?

MB : This has been taken care of with some enhancements in the spec. Refer to this.

DVC - OK, I will comment below.

DVC25 - Locale-specific formatting - Approved

How do you figure out the default date/time format (do you use locale information?)

MB : The user needs to provide this while setting up the import/export through the wizard. Refer to this.

DVC - OK, will comment below.

DVC26 - Exporting from views or results - Approved

There appears to be no way to export from a View or from a query result. This would be very useful. I know we can do from a query result today, but we should have a consistent interface for this.

MB : Agreed. I'm capturing this in the Future requirements here

DVC27 - Export to table - connection management

When exporting to a database table, it appears you are providing a UI for creating a new connection (including Test Connection feature). We already have a way for you to pick a connection and optionally create a new connection, and we should use this consistently. Please see me for details.

MB : Sure. Let's discuss this. The intent here, however, was to test the admin user/passwd for the existing database connection. This would probably be needed while importing as well (both files and tables).

DVC I'm not sure I understand what you mean by "test the admin user/password for the existing database connection". What I'm trying to say here is you shouldn't write your own UI for picking a database connection when the API already has such a UI. And what's wrong with the user creating a new connection and using that as the target? This is relevant for import into existing table too (I still think they're basically the same feature).

DVC28 - Import from table

The action should just be "Import..." rather than "Import Table...". "Import Table" is confusing when you're importing from a file. A file is _not_ a table, to most developers. I know conceptually you can think of it that way, but really for most users they are very different things.

DVC29 - Progress indication

Your spec currently doesn't describe how progress of import/export is displayed. I think a good example of how a long-running process like this should be done is the Install Plugin action. In this case, a progress bar is displayed on the dialog itself, and then if the user wants they can press a button to say "run in background". This is how cancelling should be implemented too, either by cancelling on the dialog or by clicking on the progress bar (if running in background) and choosing "cancel task".

DVC30 - Export and Import for XML

Your spec only shows the details of what happens when dealing with CSV, XLS and relational tables. We need to see a full spec for how the UI looks and behaves for the other supported formats (XML and OpenOffice) - meaning, where there are differences, those differences need to be shown. I'm particularly concerned that we know very little about what this looks like for XML.

DVC31 - Selecting a sheet

Nice feature to select a sheet. This should work for OpenOffice documents as well as XLS, no?

DVC32 - CR, CR/LF and Newline

Sorry, what's the difference between CR and Newline? Aren't these the same?
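
For reference, CR and LF are distinct characters. The sketch below assumes the spec's "Newline" means LF, which the spec should confirm. The class name `LineEndings` is hypothetical.

```java
final class LineEndings {
    static final String CR = "\r";      // carriage return, code point 13
    static final String LF = "\n";      // line feed, the usual Unix "newline", code point 10
    static final String CRLF = "\r\n";  // the pair used by Windows text files

    /** Splits text on any of the three conventions, trying CR/LF before the single characters. */
    static String[] splitLines(String text) {
        return text.split("\r\n|\r|\n", -1);
    }
}
```

An importer that accepts all three conventions, as `splitLines` does, would not need the user to choose between them at all.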

DVC33 - Jumping a few steps

In step 5 of Import, you say "Wizard may jump few steps on this wizard based on the iterations required by the file type being processed." Please describe in more detail what you mean by this.

DVC34 - Text qualifiers

I think this is normally called a "quoting string". I suggest using this term, as it is more commonly used.
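
As a sketch of what the quoting string does on export, the helper below wraps a field in the quote character only when needed and doubles any embedded quote, which is the common CSV convention. Whether the feature follows that convention is an assumption. `CsvQuoting` is a hypothetical name.

```java
final class CsvQuoting {
    /** Quotes a field when it contains the delimiter, the quote character, or a line break. */
    static String quote(String field, char delimiter, char quoteChar) {
        boolean needsQuoting = field.indexOf(delimiter) >= 0
                || field.indexOf(quoteChar) >= 0
                || field.indexOf('\n') >= 0
                || field.indexOf('\r') >= 0;
        if (!needsQuoting) {
            return field;
        }
        String q = String.valueOf(quoteChar);
        // Embedded quote characters are escaped by doubling them.
        return q + field.replace(q, q + q) + q;
    }
}
```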

DVC35 - Precision

Precision is an unclear term for character/variable length types. Normally it's reserved for numeric types. How about "default length/precision" rather than just "default precision"?

DVC36 - Create Data File If Not Exist

If this feature does not make sense for NetBeans, then it should not be shown. Otherwise we're just confusing our users.

DVC37 - Import Table Metadata

The title for this step is confusing. I think you mean "Define Table Metadata"

DVC38 - Header offset

Huh? Please figure out what this means and describe (or remove)

DVC39 - Tool tips for entry fields

I would like to see tool tips for each of the entry fields of the wizard, providing descriptions similar to the ones you are providing in the spec

DVC40 - Field delimiters for fixed width files

If the records are fixed width, then you could potentially have a file with no delimiter between fields at all, as they are fixed width... I don't think you can assume there is a space between fields.
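
A minimal sketch of parsing a fixed-width record by column widths alone, with no delimiter assumed between fields. The class `FixedWidthParser` is hypothetical, and trimming the padding from each field is an assumption about the desired behaviour.

```java
import java.util.ArrayList;
import java.util.List;

final class FixedWidthParser {
    /** Cuts a record into fields using column widths only; short records yield empty fields. */
    static List<String> parse(String record, int[] widths) {
        List<String> fields = new ArrayList<>();
        int start = 0;
        for (int width : widths) {
            int end = Math.min(start + width, record.length());
            // Trim padding; a field that lies past the end of the record becomes "".
            fields.add(start < end ? record.substring(start, end).trim() : "");
            start += width;
        }
        return fields;
    }
}
```

For example, the record `AB  12345X` with widths 4, 5 and 1 yields the fields `AB`, `12345` and `X`, even though nothing separates `12345` from `X`.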

DVC41 - Import and auto-increment

How do you create a new table on an import where you want a field (e.g. the PK) to be auto-increment?

DVC42 - Additional columns

What if you want to create columns that don't exist in the source file?

DVC43 - Removing columns

What if you don't want all fields in the source file to end up in the table? There doesn't appear to be an option to do this...

DVC44 - Import into existing table

The description of the flow seems to imply that you have to first type in a target table that's new, and then you are given the option to say "import into existing table." I know I would be confused by this if I wanted to import into an existing table.

I actually think that if you want to import into an existing table, one of the first things you would want to do is select which table, and then you go about the business of choosing an import source (file or other table), etc., etc.

BTW, importing into an existing table is the more common use case so we should design with that in mind and make sure that task flow is the easiest and most natural.

Anyway, I would expect the choice to import into an existing table to happen on the original screen, not in a subsequent screen. Make sense?

DVC45 - Import into existing - Selecting an existing table

The UI seems to assume that you can only run "Import" from a certain table node, and that is what lets you figure out which list of tables to offer when importing to an existing table. We should be clear that we *won't* support import at say the Connection node or Schema node.

DVC45 - No next button?

The "Select an existing table" screen shot doesn't seem to have a next/previous/finish set of buttons... Is that intentional? How does it fit in to the rest of the wizard if that's the case?

DVC46 - Import into existing - View preview

This seems a little odd -- it seems like you're saying that the same page where you preview the "new" table transforms itself into a completely different UI where you define a mapping. It seems to me that we should think this through a little more. Perhaps this is OK, but it seems a little odd...

DVC48 - Import into existing table - auto-increment fields

Do you allow import into auto-increment fields? I hope not :)

DVC49 - Import into existing - Error handling - Multiple target fields

Do you allow the same source field to be mapped into two separate target fields?

DVC50 - Import into existing - Error handling - invalid type match

How do you handle type conversions and invalid type matches (e.g. a date in the source being mapped to an int in the target)?

DVC60 - Choose fields for export

A more common UI layout for selecting fields is to have the excluded fields on the left and the included fields on the right, with arrow buttons to move a field to the included or excluded list. But this works too.

DVC61 - Choose fields for export - Export path

I would put this first, before the name. My preference. I don't know if there's any standard for this

DVC62 - General - Required fields

A general comment, you need to follow the standard model for indicating when a required field has not been specified. You can see me or John Baker for details.

DVC63 - Choose fields for export - Specifying output file

There is a standard approach to creating a file of a certain type so it uses the standard suffix. For an example, please see File->New File->New HTML file. Note how they let you browse to a folder and specify a name, and then it shows you the resulting full path to the file. You should follow this standard.

DVC64 - Select Data Export Properties - Dates

I am not sure about this approach, where you specify a date format and delimiters for date and time. I think instead we should support some standard formats, and then let users specify any format they want using the standard SimpleDateFormat patterns. Also, we should default to the appropriate format based on the user's locale, while letting them change the output.
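
A sketch of the two options suggested here: a user-supplied `SimpleDateFormat` pattern, and a default taken from the user's locale. The class `DateExportFormats` and its method names are hypothetical. The pattern variant pins the time zone to UTC only so that the example is repeatable.

```java
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

final class DateExportFormats {
    /** Formats with a user-supplied SimpleDateFormat pattern, in UTC for repeatability. */
    static String withPattern(Date value, String pattern) {
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ROOT);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format.format(value);
    }

    /** Formats with the given locale's default medium date and time style. */
    static String localeDefault(Date value, Locale locale) {
        return DateFormat.getDateTimeInstance(DateFormat.MEDIUM, DateFormat.MEDIUM, locale)
                .format(value);
    }
}
```

The wizard could pre-fill the format field from `localeDefault(..., Locale.getDefault())` and let the user overwrite it with any pattern. The exact text produced by the locale styles depends on the JDK's locale data.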

DVC65 - Select Data Export Properties - Decimal

We should default to the decimal delimiter that is correct for the current locale
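
The locale's decimal separator is available from `DecimalFormatSymbols`, so the default can be looked up rather than hard-coded. A minimal sketch, with the hypothetical class name `DecimalExportFormats`:

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

final class DecimalExportFormats {
    /** Returns the decimal separator that the given locale uses. */
    static char separatorFor(Locale locale) {
        return DecimalFormatSymbols.getInstance(locale).getDecimalSeparator();
    }

    /** Formats a number with two fraction digits and the locale's separator, without grouping. */
    static String format(double value, Locale locale) {
        DecimalFormat format = new DecimalFormat("0.00", DecimalFormatSymbols.getInstance(locale));
        return format.format(value);
    }
}
```

The wizard default would be `separatorFor(Locale.getDefault())`, which the user can then override.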

DVC66 - Select Data Export - Character encodings

Don't we need to support exporting to different character encodings? Or are we always exporting to UTF-8? That seems like it might be a problem...
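
To show why the choice matters, here is a minimal sketch in which the export charset is a parameter rather than fixed. `EncodedExport` is a hypothetical name. The same text produces different bytes depending on the charset, so the file is only readable by a consumer that expects the same one.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

final class EncodedExport {
    /** Encodes the text with the charset the user picked. */
    static byte[] encode(String text, Charset charset) {
        return text.getBytes(charset);
    }

    /** Writes the text to the target file in the chosen charset. */
    static void write(Path target, String text, Charset charset) throws IOException {
        Files.write(target, encode(text, charset));
    }
}
```

For example, the single character U+00E9 (e with acute accent) is two bytes in UTF-8 but one byte in ISO-8859-1.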

DVC67 - Select Export Columns when exporting to table - choosing a connection

Please, use the existing UI for selecting a connection. Don't do all this stuff with creating a new connection and testing the connection. Just use what we have. It doesn't need to be an admin user and password. Any database user with correct rights should be able to run this wizard.

DVC68 - Export to table - selecting a table.

Again, selecting the table should be the first thing you do, not something you do after selecting the source table and defining the import properties.

DVC69 - Import using user-defined data and decimal formats

I think this is required to support an international user base.

DVC70 - Backup/restore feature

We are not going to do this in NetBeans - it is out of our scope, and everybody uses the backup/restore features of the database vendor. If you want to provide this in your product, that is up to you, but I really don't think this belongs in NetBeans.
