SCC Active Contracts - Data Issues

The data is stored in a MySQL database with this schema.

Contract Documents

Ideally, the County would make the contract documents themselves available to the public, but they do not. There is a contract "owner" within the Procurement Department who might answer questions, but I have not found a question that will not be answered with another question. It seems that one must file a California Public Records Act request in order to see a contract. It is, though, easy to do this. Send an email to [email protected] and specify a contract or contracts and they will send you electronic copies, after waiting the allowed time period. I have linked in the contract texts that I have received above. Click on the Contract ID links to see the details for the contract, which will incude the contract documents. If anyone thinks that other contracts should be looked at, please let me know and they can be requested.

Importing

The Procurement Office published two monthly reports which describe the active contracts. They publish these on an "Active Contracts" page. The reports may or may not be published every month. Some have been missed, perhaps because of staff shortages associated with the Covid-19 pandemic. The reports for a given month seem to come out between 6 or 7 and 9 weeks after the end of the month.

The import scripts do some data correcting, but it is not much. There are many vendor names, for example, which are obviously incomplete. As I am able, I find the proper names for these vendors and I display these vendors with their proper names. I do not replace the bad names in the original data. Instead I add a row to a table which joins with the vendors table to provide the correction. I try to make all corrections by adding correct data, but not by removing the original data. If I replaced these original values, I would be breaking the connection between my data and the data published by the Procurement Office. If I make fixes, I always want these to verifiable now and in the future against the Procurement Office data.

Ultimately, the accuracy of this data cannot be completely guaranteed. The data is published by the Procurement Office as PDF files and the process of reading data out of PDF files is inherently error-prone. Indeed, one does not use PDFs if one wants to encourage re-use of the data being published. A PDF is for making the data look good and not much else.

Contract-Months

The main contracts table contains the expected information about a contract but it also points to the month in which a contract appeared. This causes some duplication. For example, if a contract ran from January 1, 2021 to May 31, 2022 and if every month had a exported PO report, then there would be 17 rows in the contracts table for this contract. If the contract was written and never changed, then all 17 rows would be exactly the same. Many contracts, though, do change and I am keeping track of when those changes appear in the exported data.

At any one time, the contracts application is showing data relevant to the last month for which data is available. There may be delays in my processing this data. For example, as of now (2022-09-11), the application shows data for June of 2022. Data for July of 2022 has become available sometime in the last week, but that data has not yet been imported.

Contract Total Values vs Year Values

In most places in the contracts application, one will see both the total value of a contract or the year value of a contract. At some points, only one of these will be displayed. What do these mean?

Most contracts cover multiple years. The only pieces of information about this being exported are the "Effective Date", the "Expiration Date", and the "Contract Value" of the contract. So, I am doing what I can. To determine the "year value" for a contract for a particular year, I divide the start and stop dates of the contract into their years and apportion the total contract to each year. For example, if a contract is due to pay $2 million and its effective date was 2021-04-01 and its expiration date was 2022-03-31. So, the the portion of the contract period in 2020 covered 9 months, the period covered in 2021 was 12 months and the period covered in 2022 was 3 months. So, the year values would be:

total value      $ 2,000,000.00    covers 2020-04-01 to 2022-03-31
------------------------------------------------------------------
2020 value           750,000.00    covers 2020-04-01 to 2020-12-31
2021 value         1,000,000.00    covers 2021-01-01 to 2021-12-31
2022 value           250,000.00    covers 2022-01-01 to 2022-03-31

The calculations are actually based precisely on the number of days of the contract period covered in each year, but this example hopefully makes clear how the contract value is divided up for reporting.

Note that my "year value" calculation has absolutely nothing to do with when the County is actually spending money on a contract. I am sure that a check register somewhere keeps track of precisely when checks are being sent out for these contracts. Is there some way that the County links these payments to the contract behind them to enable oversight? One would hope so, but I am not sure how or if the public can get the specific information about these payments from the contract information.

Contract ID Numbers

There are three kinds of ID numbers for contracts. The reasons for this have to do with what system was being used to track a contract when it was first seen. In the monthly "Contracts Report", these IDs appear under the "Contract ID Ariba" and "PO ID)" heading. In the "SA BC Report", the ID appears under the "Contract ID (PO ID)" heading. In how many contracts are which IDs defined? The data below shows the usage of these IDs in the data for June of 2022.

    count    contract_id   ariba_id     sap_id
    1327      NON_NULL      NULL         NULL
     127      NULL          NON_NULL     NON_NULL
     496      NULL          NON_NULL     NULL
     271      NULL          NULL         NON_NULL

I have recently noticed (July 2023) that there are two types of id sets. In one case, the contracts are listed in a report with a title like 'Contracts Report for Month of April 2023.pdf'. In this report, each contract has an Ariba Id, an SAP Id, or both and the set of these two is unique. Also, A single contract can point to more than one agency. For example, a contract with an Ariba Id number 'CW2239242' with 'VICTORY SUPPLY' relates to:

Dept. of Correction - 240
Sheriff's Office - 230
Sheriff's Court/Custody Oper. - 231

In the other case, the contracts are listed in a report with a title like 'SA BC Report for Month of April 2023.pdf'. In this report, each contract has a Contract Id and only one agency is connected to each row. But the Contract Id may appear on more than once. But the Contract Id, Agency pair is unique.

Therefore, in order to track a contract from month to month, one must look for whichever pair of values is appropriate. Of course, a contract can be amended and one of its Id numbers can change. It is not clear how this can be tracked. One could combine the contract rows but the SA/BC report shows how much money goes to each agency and the Contracts report does not, and it would not be good to lose that detail.

An example is:

        Agency: Controller-Treasurer Dept - 110
        Contract Id: 4300018669
        Vendor: AYA HEALTHCARE INC
        Start: 5/15/2020
        End: 6/30/2023
        Amount: $ 95,885,456.93
        Dewscription: Consulting Services

        Agency: Facilities and Fleet Department - 263
        Contract Id: 4300018669
        Vendor: AYA HEALTHCARE INC
        Start: 5/15/2020
        End: 6/30/2023
        Amount: $ 5,000,000.00
        Dewscription: Consulting Services

        Agency: Health & Hospital Systems - 921
        Contract Id: 4300018669
        Vendor: AYA HEALTHCARE INC
        Start: 5/15/2020
        End: 6/30/2023
        Amount: $ 38,370,967.07
        Dewscription: Consulting Services

This vendor has other contracts and, as makes sense, they have different Contract Id numbers.

The only place we actually need to identity the contract is when we are relating documents to a contract. And listing the contracts multiple times is not unforgivable. If there are three documents for the contract above, showing those three documents three different times should not be a problem. A page that shows a single contract and its history would want to combine these contracts into one.