Messages - Justin Derrick

1711
MP Server / Re: Ignoring blank PDF files
« on: June 20, 2012, 02:48:12 AM »
That sounds like a bug in the source system.  Since CMOD is an archive, an empty document is something I'd consider an error.  (Why would I want to archive an empty document?  If there's a valid reason for having no results in a report, the report should contain a notice saying so.)

1712
Yeah, I'd recommend dramatically increasing the size of an individual object -- though I wouldn't go any larger than 100MB.  Be aware that this WILL cause performance problems for users with slow network connections, since the size of a single (i.e., the first) object is now much larger.

If you have large object support enabled as well, I'd increase that number to 1000 pages.

Good luck, and please let us know what the solution to your issue is!

1713
Report Indexing / Re: Loading AFp using generic indexer.
« on: June 01, 2012, 07:40:05 PM »
CMOD will know it's AFP data because you'll be loading it into an Application that is defined as having AFP as the data type.  CMOD should search the .res file to find the resources that are referred to in the AFP data stream, and turn them into a resource bundle.

It's a good question -- I'm not EXACTLY sure how this is handled when using a Generic Index -- but I have done it, and I'm quite sure that it works.

Test it out and report back!

-JD.

1714
This is sort of a hole with no bottom.  :)

You can avoid the most common (and painful) mistakes by considering the following:

  • Database volumes & cache filesystems should be on your fastest, most reliable disk technology.
  • Index & temporary filesystems can be on your next tier (ie, less fast, less reliable, less expensive).
  • For the database, try to match RAID stripe size, filesystem block size, and database page size.
  • Ensure that there are no artificially low limits on filesystem sizes for database & cache -- they will grow.  A lot.

  • Read the installation guide to learn how to calculate compression ratios, so you can accurately estimate the required size of your cache.
  • Create separate 'landing zone' directories for each application group. If one AG's data volume becomes a problem, you can put it into its own filesystem where it won't gobble up all the space and impact other AGs.  This is especially important when you're unable to load data due to downtime/bugs.

  • Don't store your own data (scripts, logs, resources) under /usr/lpp/ars.  Logs should go into their own filesystem.  System-related scripts (OS-level performance monitoring & error reporting) should go into /usr/local/.  CMOD-specific scripts should go into /home/archive/bin.  I recommend creating a structure like /home/archive/resources/AGName/AppName/Version for storing AFP resources that you need to store locally.
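
To make the cache-sizing bullet above a bit more concrete, here is a minimal sketch in Python. The formula (raw daily volume divided by the compression ratio, times the days kept in cache, plus some headroom) and every number in it are assumptions for illustration only, not figures or methods taken from the installation guide. Use the guide's own procedure for real capacity planning.

```python
def estimate_cache_gb(raw_gb_per_day, days_in_cache, compression_ratio, headroom=0.25):
    """Rough cache estimate: compressed daily volume x retention, plus headroom.

    All inputs are hypothetical planning numbers, not CMOD defaults.
    """
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    compressed_per_day = raw_gb_per_day / compression_ratio
    return compressed_per_day * days_in_cache * (1 + headroom)


# Example: 20 GB/day of raw reports, cached for 90 days, compressing at about 8:1.
print(estimate_cache_gb(20, 90, 8))
```

The headroom factor is there because of the earlier point in the list: database and cache filesystems will grow, usually more than expected.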


That's just off the top of my head -- I'm sure there are many more that others can add.

Good luck!

1715
You should be able to modify these indexing parameters manually inside the CMOD Admin client.  Simply update the INDEXSTARTBY parameter to be "2", indicating that it should find index values starting on Page 2 of your PDF.

1716
MP Server / Re: Folder for multiple application groups
« on: May 24, 2012, 05:51:02 AM »
For performance reasons, it's best to create Application Groups with multiple Applications, rather than binding a variety of AGs together at the folder level.  This is especially true if one or more of the AGs has tens of millions of rows.  If it makes sense, consider consolidating your AGs.

If a new consolidated AG doesn't work in your scenario, keep the number of mapped AGs to a minimum, and increase the number of rows per database table for large AGs (ie, more than 10 million documents per month).

Also, turning off 'Display Document Location' if it's enabled will help maintain performance.

-JD.

1717
You'd normally trigger on something that only appears on the first page of a document, like "Page 1 ".  (Notice the space after the 1 to avoid triggering on pages 10-19, 100-199, 1000-1999!)

Are there page numbers on your report?
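
The trailing-space trick above can be illustrated with plain string matching. This is only a sketch in Python of the matching logic; it is not how the indexer itself is configured, and the function name and sample page text are made up for the example.

```python
def is_first_page(page_text, trigger="Page 1 "):
    """Return True if the trigger string appears anywhere in the page text."""
    return trigger in page_text


# Hypothetical page headers from a 120-page report.
pages = [
    "Statement   Page 1 of 120",
    "Statement   Page 10 of 120",
    "Statement   Page 100 of 120",
]

print([is_first_page(p) for p in pages])            # [True, False, False]
print([is_first_page(p, "Page 1") for p in pages])  # [True, True, True]
```

Without the trailing space, every page whose number starts with a 1 looks like the start of a new document.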

1718
MP Server / Re: Capturing job log files on AIX
« on: May 22, 2012, 11:24:57 AM »
If your index information is in a reliable (aka fixed/permanent) location inside the output you're loading into CMOD, you could build a series of Applications with their own ACIF index parameters to capture that index info.

There is a huge perk to using the generic indexer though -- you can concatenate an entire day's worth of job logs into a single generic index file, which reduces overhead for the loads and improves compression performance.
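
As a rough sketch of that concatenation idea, the Python below appends each job log to one data file and writes one index group per log, using byte offsets and lengths. The keyword names (CODEPAGE, GROUP_FIELD_NAME, GROUP_FIELD_VALUE, GROUP_OFFSET, GROUP_LENGTH, GROUP_FILENAME) are written from memory of the generic indexer file format, so check them against the indexing reference before relying on this. The field names "jobname" and "rundate", the date format, and the function name are hypothetical and would have to match your own Application Group.

```python
import tempfile
from pathlib import Path


def build_generic_index(log_paths, data_file, index_file, run_date, codepage=819):
    """Concatenate job logs into data_file and write one index group per log.

    Field names "jobname" and "rundate" are hypothetical placeholders.
    Returns the total number of bytes written to data_file.
    """
    lines = [f"CODEPAGE:{codepage}"]
    offset = 0
    with open(data_file, "wb") as out:
        for path in map(Path, log_paths):
            payload = path.read_bytes()
            out.write(payload)
            lines += [
                "GROUP_FIELD_NAME:jobname",
                f"GROUP_FIELD_VALUE:{path.stem}",
                "GROUP_FIELD_NAME:rundate",
                f"GROUP_FIELD_VALUE:{run_date}",
                f"GROUP_OFFSET:{offset}",
                f"GROUP_LENGTH:{len(payload)}",
                f"GROUP_FILENAME:{data_file}",
            ]
            offset += len(payload)
    Path(index_file).write_text("\n".join(lines) + "\n")
    return offset


if __name__ == "__main__":
    # Demo with two tiny made-up job logs in a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        (tmp / "JOBA.log").write_text("job A finished\n")
        (tmp / "JOBB.log").write_text("job B finished with warnings\n")
        build_generic_index(
            [tmp / "JOBA.log", tmp / "JOBB.log"],
            tmp / "joblogs.out", tmp / "joblogs.ind", "05/22/12",
        )
        print((tmp / "joblogs.ind").read_text())
```

One load of the combined file then replaces a separate load per job log.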

1719
z/OS Server / Re: How to read attached documents in the system log
« on: May 08, 2012, 05:59:12 AM »
I want to set up an automatic read process of the attached documents to scan the text to determine the actual number of rows loaded and whether the documents were unloaded or not.

It sounds like this is a custom application, since one needs SQL to get the syslog messages, then other API or DOCGET calls to pick up the text to read and parse.

That would be so useful to have as part of administration tools.

I'm not sure if you have the "arslog" logging exit in z/OS -- but if you do, I'd use that to check for message number 88, and either write the log output you're interested in to a file, or take some specific action (like issuing an unload, or trying again).

1720
MP Server / Re: segment field and index field
« on: May 02, 2012, 06:20:05 AM »
The advice I normally recommend is that you find the 'mostly unique' field, like Customer Number, and index ONLY that field, and require that field be used for searches.
And if the user needs another entrance then it is too bad for them?

No, then you need to re-evaluate the use case, and your indexes.  In the overwhelming majority of cases, a single field contains the information needed to narrow down your results to a reasonable, useful number of records.  

There are a number of reasons for requiring specific fields for searches -- availability and security are two that I can think of off the top of my head.  By forcing someone to use an indexed field in their searches, you ensure that they use the most efficient method of searching the database -- rather than gobbling up all the CPU and I/O on the server doing table scans, slowing down other users.  Also, by forcing the usage of, say, a customer number field (and forbidding wildcards), you prevent unscrupulous users from producing information like customer lists to sell to your competition.

Again, there is no one, single, best answer to a general question like this.  Each situation is different, and can use a variety or combination of methods to get the best performance.  Your best bet is to experiment and understand how CMOD and your database engine work at a lower level, so that you can choose the methods that give you the most benefit for the smallest cost.

-JD.

1721
MP Server / Re: segment field and index field
« on: May 01, 2012, 06:58:05 AM »
Query performance is very tricky -- and there are a lot of things you can do in CMOD to tweak performance.

It really depends on what your data looks like.  Indexes are simply a list of database pages that contain the index value you're searching for.  If you only load one document per day into an Application Group, then indexing the date field might be all you need, since any search for a single date will only require you to load up one database page to find the record you're looking for -- but that's not common.

The advice I normally recommend is that you find the 'mostly unique' field, like Customer Number, and index ONLY that field, and require that field be used for searches.  This means that in a 10 million row table, your customer number might show up 10 times -- and by using the index, DB2 knows it only needs to read 10 pages to complete the query.  Then, it uses criteria from the other fields (date, etc.) to discard from the short list, and return exactly what's required.
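
The same idea can be tried out on a small scale. The sketch below uses Python's built-in sqlite3 module as a stand-in for DB2 (an assumption for convenience; the optimizers differ), with a hypothetical table and column names and only 100,000 rows. Each customer number appears 10 times, only that column is indexed, and the date is applied as a filter on the short list the index returns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (cust_no INTEGER, doc_date TEXT, doc_name TEXT)")

# 100,000 rows over 10,000 distinct customers: each cust_no appears 10 times.
rows = [(i % 10_000, f"2012-{(i % 12) + 1:02d}-01", f"doc{i}") for i in range(100_000)]
conn.executemany("INSERT INTO docs VALUES (?, ?, ?)", rows)

# Index ONLY the 'mostly unique' field.
conn.execute("CREATE INDEX idx_cust ON docs (cust_no)")

query = "SELECT doc_name FROM docs WHERE cust_no = ? AND doc_date = ?"
params = (1234, "2012-03-01")

# The last column of each EXPLAIN QUERY PLAN row is the human-readable detail.
plan = conn.execute("EXPLAIN QUERY PLAN " + query, params).fetchall()
plan_detail = " ".join(row[-1] for row in plan)
print(plan_detail)

# The index narrows the search to this customer's 10 rows; the date filter trims the rest.
candidates = conn.execute("SELECT COUNT(*) FROM docs WHERE cust_no = ?", (1234,)).fetchone()[0]
matches = conn.execute(query, params).fetchall()
print(candidates, len(matches))
```

The query plan should mention idx_cust, which shows the engine going through the customer-number index rather than scanning the whole table.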

-JD.

1722
MP Server / Re: Excel indexer?
« on: April 28, 2012, 03:10:02 PM »
Perl has modules for reading Excel spreadsheets directly.  I'd highly recommend using them.

1723
Report Indexing / Re: Unique key in CMOD
« on: April 25, 2012, 03:40:10 AM »
I think the problem here is that it's difficult, if not impossible, to enforce a unique key on an Application Group -- because CMOD uses multiple tables to keep query performance linear as the number of documents in the AG grows.  You might be able to enforce uniqueness of a key within a single table, but I don't think that's what you're looking for.

You might be able to get very close to what you're trying to do by increasing the number of rows per table to a very large number, like 100 to 200 million -- depending on your volume, a large number like this might cover all your documents for the duration they're kept in CMOD -- or at the very least would make sure that your documents have 'mostly unique' key fields.


1724
MP Server / Re: Not sure: is the cache used for retrieval as wel?
« on: April 23, 2012, 04:10:34 AM »
Documents retrieved from TSM don't go back into the cache filesystem.

1725
MP Server / Re: CMOD slow during online DB2 backup
« on: April 19, 2012, 03:46:55 AM »
You're obviously constrained somewhere -- CPU or IO or RAM.  It's really up to your System Admin to find out. 

If you ARE the System Admin, check on the CPU/IO stats with iostat / vmstat / topas.

It's probably time for a hardware upgrade though -- my personal development server (p510Q) is more powerful than your production server!  :D

-JD.
