Project Open Custom E-Mail Notifications - Part 3
In parts 1 and 2, I presented the queries to pull individual workflow steps and closed items out of Project Open. The results of those queries represented the raw data we'd need to send e-mails. Trouble is, every time we'd run the queries, we'd run the risk of sending and resending and resending e-mails. That could become problematic. Folks could unfairly blame Project Open for inundating their inboxes!
Obviously, we need a mechanism to track what e-mails we've already sent. If you remember the queries from the last two articles, you'll remember that the column task_id uniquely identified a given task. Because of how the queries join tables, it appears that the task_id can appear multiple times. But in reality, the task_id uniquely identifies a task. That means we need a mechanism to know for what task_ids we've sent e-mails.
I can think of three realistic ways to approach this. I'm going to present them so you can see the rationale I bring to the table any time I "extend" Project Open (or any other open source project).
- Extend Project Open: I could use Concurrent Version System (CVS) to checkout an appropriate version of Project Open and upgrade the SQL, tcl, and other source code to customize the notification subsystem. I could then offer the code back to the project for consideration as part of the code base. In this case, I didn't think this was a good option for two reasons. First, I don't know tcl well enough to inflict my code on the project team. They have families and lives -- they don't need that kind of horror! Second -- and more importantly -- Project Open already has a robust notification system. I just couldn't figure it out. I decided against this option.
- Add a column to wf_tasks: I could use an ALTER TABLE statement to add a column directly to wf_tasks. It could be a simple integer, and I could programmatically set the value to 1 after I send the e-mail. This would be simple to implement and use, but I decided against it. I don't like to directly manipulate a table whose structure affects a lot of downstream processes -- processes I don't fully understand. In other words, I'm afraid I'd screw something up, and then I'd be embarrassed and the people who depend on me to keep things running would be cross.
- Add an independent table: I could add a table with a column to hold the task_ids for which I've sent e-mails. It's easy to setup and if I keep it small (and well-indexed), it should have no impact on the application of PostgreSQL performance. I chose this option.
Even though I think creating a small table is a safe option, I still approach the task with respect for the overall system. Project Open is a huge application, and one false move on my part could have unintended consequences. I can say that I've used this e-mail solution for some time now, and it appears to have no negative impact on the system.
Once the table's in place, I can adjust the query to show open workflow steps to look like this:
SELECT a.task_id, a.workflow_key, a.transition_key, b.*, c.description, d.transition_name, acs_object__name(e.user_id) AS thename,
f.email
FROM wf_tasks a, wf_cases b, wf_workflows c, wf_transitions d, wf_user_tasks e, parties f
WHERE b.case_id = a.case_id
AND c.workflow_key = a.workflow_key
AND d.workflow_key = a.workflow_key
AND d.transition_key = a.transition_key
AND e.task_id = a.task_id
AND f.party_id = e.user_id
AND a.state = 'enabled'
AND a.task_id NOT IN (SELECT pns_task_id FROM pona_notifications_sent)
ORDER BY task_id DESC
The same query for closed items could look like this:
SELECT a.task_id, a.case_id, b.object_id, b.state, c.description, d.transition_name, f.email
FROM wf_tasks a, wf_cases b, wf_workflows c, wf_transitions d, im_tickets e, parties f
WHERE b.case_id = a.case_id
AND c.workflow_key = a.workflow_key
AND d.workflow_key = a.workflow_key
AND d.transition_key = a.transition_key
AND a.transition_key LIKE '%close%'
AND e.ticket_id = b.object_id
AND f.party_id = e.ticket_customer_contact_id
AND a.task_id NOT IN (SELECT pns_task_id FROM pona_notifications_sent)
ORDER BY task_id DESC
Now, I have four things to do:
- Create the table
- Create an index
- Populate the existing enabled items
- Populate the existing closed items
I created the table with this script:
CREATE TABLE pona_notifications_sent
(
pns_task_id INTEGER UNIQUE NOT NULL,
pns_when_sent DATE NOT NULL
);
It's a simple table to minimize the performance impact. I needed at least a column to hold the task_id; I added a date/time column so I had a little diagnostic information if someone complained about missing e-mails.
This script created the index:
CREATE UNIQUE INDEX pona_notifications_sent_ndx01 ON pona_notifications_sent (pns_task_id);
If I didn't have this index, after the first couple hundred thousand e-mails, I'm pretty sure performance (at least for the e-mail application, and potentially for all of PostgreSQL -- table scans are evil!) would perform terribly.
I found out during my first round of Unit Integration Testing (UIT) that I have to first mark the existing workflow tasks and closed items as having been sent. Otherwise, I end up with several hundred e-mails in my test system! I used this query to populate the open workflow steps:
INSERT INTO pona_notifications_sent
SELECT a.task_id, NOW()
FROM wf_tasks a, wf_cases b
WHERE b.case_id = a.case_id
AND a.state = 'enabled';
Finally, I used this query to populate the closed workflow steps:
INSERT INTO pona_notifications_sent
SELECT a.task_id, NOW()
FROM wf_tasks a, wf_cases b
WHERE b.case_id = a.case_id
AND a.transition_key LIKE '%close%'
AND a.task_id NOT IN (SELECT pns_task_id FROM pona_notifications_sent);
I had to add the NOT IN clause because I found. much to my chagrin, that I didn't follow my own advice about better living through standards when I built my early workflows. Some of them had items with an "enabled" status AND a transition_key that contained the word "close". By fashioning the INSERT statement this way, I didn't try to insert a value that violated my unique key index.
At this point, I have all of the Data Manipulation Language (DML) to process e-mail notifications. In the next part of this series, I'll present the program that'll actually send the e-mails.