Transition Strategy ¶
To enable a replacement of WhenIWork driven by design and user feedback, IndeVets’ technical infrastructure will evolve through a series of phases that enable WhenIWork- and IndeVets-provided user interfaces for doctors to co-exist.
The approach we will employ is an expression of the so-called Strangler Pattern.
Strategy ¶
Desynchronized Transition ¶
As opposed to a hard cutover, a desynchronized transition allows us to release minimal versions of replacement user interfaces as early in their design and implementation as possible. Minimal features will be released one at a time, and to controlled subsets of users. Then, through testing and feedback in real-world use, the features can be gradually refined and expanded to meet and then exceed WhenIWork’s capabilities and ease of use. We will be able to control on a feature-by-feature basis which users have access to new interfaces without risking or disrupting existing workflows.
For example, a timesheet entry feature might be tested by admins and a small set of compensated alpha-tester doctors, then released to a small set of beta-tester doctors, then released to a portion of regular doctors, then released to all doctors except a few that have an unusual dependence on WhenIWork, and then released to the final doctors. Along the way, the feature would develop from bare-bones to full-parity to exceeds-expectations with as much investment as possible deferred at each stage to benefit from real-world results and learnings.
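The staged release described above could be gated with a small access check like the following. This is a minimal sketch under stated assumptions: the stage names, the `cohort`, `is_admin`, and `wiw_dependent` user fields, and the percentage-based bucketing are all hypothetical illustrations, not an existing Core feature.

```python
import hashlib

# Hypothetical rollout stages, in the order a feature would move through them.
STAGES = ["alpha", "beta", "partial", "all_but_holdouts", "everyone"]


def bucket(feature: str, user_id: int) -> int:
    """Map a (feature, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def has_access(feature: str, stage: str, user: dict, partial_percent: int = 25) -> bool:
    """Decide whether a user sees a feature at its current rollout stage."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage}")
    # Admins and alpha testers see the feature at every stage.
    if user.get("is_admin") or user.get("cohort") == "alpha":
        return True
    if stage == "alpha":
        return False
    if user.get("cohort") == "beta":
        return True
    if stage == "beta":
        return False
    if stage == "everyone":
        return True
    # Doctors with an unusual dependence on WhenIWork wait for the final stage.
    if user.get("wiw_dependent"):
        return False
    if stage == "all_but_holdouts":
        return True
    # stage == "partial": a stable portion of regular doctors.
    return bucket(feature, user["id"]) < partial_percent
```

Hashing the feature name together with the user ID keeps each doctor's assignment stable between requests while letting different features select different portions of doctors.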
Minimal Disposable Code ¶
To minimize how much development time is invested in components that will ultimately be discarded after WhenIWork is removed from operation, we will use the off-the-shelf and open-source laravel-auditing component to achieve the “EventInterception” and “AssetCapture” components of the strangler pattern.
This same component will also fulfill Core’s original design goal of providing audit logs for all business records, which will become increasingly critical for inspecting edge cases as we expand Core’s volume of managed work, totality of automation, and surface area of responsibility. By leveraging the same component for both Core auditing and the WhenIWork transition, we can focus investment on capabilities, interfaces, and patterns that Core will continue using and building upon after WhenIWork is phased out.
Even the thin layer of code written specifically for WhenIWork will provide continued value as a template for future integrations with third-party data systems, as its dependencies and integration points will continue to be maintained as part of Core.
Phases ¶
Phase 1: Auditing for All Core Entities ¶
Before Phase 1 ¶
Core only stores current versions of all its records in PostgreSQL.
After Phase 1 ¶
Core maintains an audit table in PostgreSQL that tracks before/after values alongside author and request metadata for every change made to records.
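To make the shape of such an audit row concrete, here is a minimal, language-neutral sketch in Python. It is not the laravel-auditing implementation; the field names and the `build_audit_entry` helper are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Optional


@dataclass
class AuditEntry:
    """One row of a hypothetical audit table (column names are illustrative)."""
    record_type: str
    record_id: int
    event: str                    # "created", "updated", or "deleted"
    old_values: dict[str, Any]    # values before the change
    new_values: dict[str, Any]    # values after the change
    user: str                     # author of the change
    request_meta: dict[str, Any]  # e.g. URL, IP address, user agent
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def build_audit_entry(record_type, record_id, before, after, user, request_meta) -> Optional[AuditEntry]:
    """Return an entry holding only the fields that changed, or None if nothing did."""
    before = before or {}
    after = after or {}
    changed = [k for k in sorted(set(before) | set(after)) if before.get(k) != after.get(k)]
    if not changed:
        return None  # no-op saves produce no audit row
    if not before:
        event = "created"
    elif not after:
        event = "deleted"
    else:
        event = "updated"
    return AuditEntry(
        record_type=record_type,
        record_id=record_id,
        event=event,
        old_values={k: before[k] for k in changed if k in before},
        new_values={k: after[k] for k in changed if k in after},
        user=user,
        request_meta=request_meta,
    )
```

Storing only the changed fields keeps rows small and makes a record's timeline readable as a sequence of diffs.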
Phase 2A: Auditing for WhenIWork Shifts ¶
Before Phase 2A ¶
Core scans WhenIWork only for changes that are relevant to Core records, and only when manually triggered by an Admin via the Pull WhenIWork Entities or Pull WhenIWork Shifts buttons.
A separate snapshot tool runs automatically every hour and saves unstructured snapshots of every record reachable via the WhenIWork API to an archive.
After Phase 2A ¶
A new automated job, perhaps borrowing from the snapshot tool, scans all shifts every hour and mirrors them into a wiw_shifts table in PostgreSQL that is backed by a Laravel model. This table mirrors the format of data provided by the WhenIWork API as closely as possible, transforming values only as needed to accurately capture them in structured PostgreSQL columns. The scan should retrieve all records accessible via WhenIWork’s API, and mirror any deleted or other status flags.
The Laravel model will apply auditing, and changes made by this automated job will be attributed to a system when-i-work user in the audit table. This will give Developers and Admins the ability to review the timelines for shifts while investigating unexpected behaviors, and give Core the ability to hook independent actions onto the post-audit event fired by the auditing package when a change to any specific WhenIWork record is detected.
By tracking an additional synced_at timestamp, this full records scan will also delete from our system any records that disappear from WhenIWork: at the end of a refresh, any record whose synced_at value was not touched during that refresh is deleted, ensuring that such deletions are captured in our audit log. The synced_at timestamp will be excluded from auditing so that syncs with no changes do not generate audit log entries.
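The refresh-and-prune logic can be sketched as follows. This is an illustrative Python model rather than the Laravel job itself: the in-memory `table` stands in for wiw_shifts, the tuples appended to `audit_log` stand in for audit rows, and the `user_id` field used in examples is an assumption about the shift payload.

```python
from datetime import datetime, timezone

AUDIT_EXCLUDE = {"synced_at"}  # bookkeeping fields that never produce audit entries
SYSTEM_USER = "when-i-work"


def full_refresh(table, remote_shifts, audit_log, now=None):
    """Mirror remote_shifts into table (a dict keyed by shift ID), auditing real changes.

    audit_log receives (event, shift_id, old_values, new_values, user) tuples.
    """
    started_at = now or datetime.now(timezone.utc)

    for remote in remote_shifts:
        shift_id = remote["id"]
        incoming = {k: v for k, v in remote.items() if k not in AUDIT_EXCLUDE}
        existing = table.get(shift_id)
        if existing is None:
            audit_log.append(("created", shift_id, {}, incoming, SYSTEM_USER))
            table[shift_id] = dict(incoming)
        else:
            old = {k: existing.get(k) for k in incoming if existing.get(k) != incoming[k]}
            if old:
                new = {k: incoming[k] for k in old}
                audit_log.append(("updated", shift_id, old, new, SYSTEM_USER))
                existing.update(incoming)
        # Always bump synced_at, even when nothing changed; it is never audited.
        table[shift_id]["synced_at"] = started_at

    # Rows not touched during this pass no longer exist upstream: delete and audit them.
    stale = [
        sid for sid, row in table.items()
        if row.get("synced_at") is None or row["synced_at"] < started_at
    ]
    for sid in stale:
        old = {k: v for k, v in table.pop(sid).items() if k not in AUDIT_EXCLUDE}
        audit_log.append(("deleted", sid, old, {}, SYSTEM_USER))
```

Running the refresh twice against identical upstream data bumps every synced_at value but appends nothing to the audit log, which is the behavior the exclusion is meant to guarantee.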
The major caveat here is that we won’t be able to capture multiple changes happening between scans. We may be able to run scans more often than hourly, but it is unlikely we could ever run them often enough to record a series of changes that happen in rapid succession. The audit log entries captured will always be attributed to the time the scan ran and to the when-i-work system user. This is an inherent limitation of WhenIWork being a system that neither exposes edit history through its API nor offers a way to push changes.
Once this is launched to production and verified after a trial period, we will discontinue the unstructured snapshot tool’s capturing of shifts to reduce excess load on WhenIWork.
Phase 3: Accelerated Scanning for Open Shifts ¶
Before Phase 3 ¶
Core’s wiw_shifts cache table may be out of date by up to whatever interval we run the full refresh at. Initially, we’re planning to do this hourly as that’s how often our existing unstructured snapshot tool has been scanning the entire WhenIWork API. It may be possible to run full refreshes more often than hourly, but full refreshes take a while to run and triggering them frequently enough for one to start while another is still running could cause cascading system failures.
Further, increasing the frequency at which we do full refreshes may put enough load on WhenIWork’s systems that they investigate it as potential malicious use. So far, our hourly snapshots have not triggered any automated rate limiting or manual intervention by WhenIWork, and it would be risky to push the bounds until they take action.
After Phase 3 ¶
Because there is one type of change happening in WhenIWork that we care about having fresh data on above all other types of changes originating in WhenIWork—shifts being taken by doctors—we can narrowly target that change for more frequent scanning.
A new automated job, independent of the one developed for the full shifts scanner described in the previous phase, will scan only open shifts (via WhenIWork’s include_onlyopen) scheduled in the next 90 days. It will bump synced_at and update any changed fields in examined shifts, but will not delete any un-synced shifts at the end of its scan like the full refresh job does. It will also need to be prevented from running while a full refresh is in progress.
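The two constraints on this job (never delete, and never overlap a full refresh) can be illustrated with the sketch below. It is a simplified, single-process model: `full_refresh_lock` is a hypothetical stand-in for whatever shared lock the real jobs would use, and the shift fields are illustrative.

```python
import threading
from datetime import datetime, timezone

# Assumed to be held by the full refresh job for as long as it runs.
full_refresh_lock = threading.Lock()


def scan_open_shifts(table, open_shifts, audit_log, now=None):
    """Update rows for the given open shifts without deleting anything.

    Returns False (doing nothing) if a full refresh is in progress.
    """
    if not full_refresh_lock.acquire(blocking=False):
        return False  # skip this run rather than queue behind the full refresh
    try:
        scanned_at = now or datetime.now(timezone.utc)
        for remote in open_shifts:
            row = table.setdefault(remote["id"], {})
            old = {k: row.get(k) for k, v in remote.items() if row.get(k) != v}
            if old:
                new = {k: remote[k] for k in old}
                audit_log.append((remote["id"], old, new, "when-i-work"))
                row.update(remote)
            row["synced_at"] = scanned_at
        # Unlike the full refresh, rows that were not examined are left alone.
        return True
    finally:
        full_refresh_lock.release()
```

Skipping a run when the lock is held, instead of waiting for it, keeps a slow full refresh from building up a backlog of abbreviated scans.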
Based on how long this abbreviated shifts scan takes to complete on average in production, we will schedule it to run as aggressively as we are confident we can get away with—perhaps as often as every 5 minutes.
Phase 4: Event-driven Updating of Core Shifts ¶
Before Phase 4 ¶
An admin manually triggers pulling shift updates from WhenIWork by clicking the Pull WhenIWork Shifts button periodically. Logic for updating Core records from WhenIWork data is tightly coupled within the code for scanning through all shifts in WhenIWork’s API. The freshness of shift status in Core is inconsistent and usually highly delayed.
After Phase 4 ¶
A listener for laravel-auditing’s Audited event will be set up and configured to ignore any audit events that are not for the WhenIWork\Shift model or that don’t include any fields Core cares about. A separate listener might be set up for each discrete type of change we want to pick up from WhenIWork and apply to Core.
Any changes to Core records triggered by WhenIWork scans via the Audited event should be attributed to the when-i-work system user in the resulting audit records, just as the audit records for updates to the WhenIWork cache table(s) are.
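The filtering and attribution rules for such a listener might look like the sketch below. It models the idea in plain Python rather than Laravel: the `AuditedEvent` shape, the watched field names, and the assumption that Core shifts can be looked up by WhenIWork shift ID are all hypothetical.

```python
from dataclasses import dataclass

WATCHED_MODEL = "WhenIWork\\Shift"
WATCHED_FIELDS = {"user_id", "start_time", "end_time"}  # hypothetical fields Core cares about
SYSTEM_USER = "when-i-work"


@dataclass
class AuditedEvent:
    """Simplified stand-in for the payload of a post-audit event."""
    model: str
    record_id: int
    old_values: dict
    new_values: dict


def relevant_changes(event: AuditedEvent) -> dict:
    """Return the changed fields Core cares about, or {} if the event can be ignored."""
    if event.model != WATCHED_MODEL:
        return {}
    return {k: v for k, v in event.new_values.items() if k in WATCHED_FIELDS}


def handle_audited(event: AuditedEvent, core_shifts: dict, core_audit_log: list) -> bool:
    """Apply relevant WhenIWork changes to the matching Core shift, if there is one."""
    changes = relevant_changes(event)
    if not changes:
        return False
    core_shift = core_shifts.get(event.record_id)  # assumes lookup by WhenIWork shift ID
    if core_shift is None:
        return False
    old = {k: core_shift.get(k) for k in changes}
    core_shift.update(changes)
    # The Core-side change is attributed to the same system user as the cache update.
    core_audit_log.append(
        {"record_id": event.record_id, "old": old, "new": changes, "user": SYSTEM_USER}
    )
    return True
```

Keeping the filter in its own function makes it straightforward to split into one listener per type of change later, each with its own set of watched fields.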
When this feature is complete, the PullShifts job should be deprecated.
Phase 5: Transactional Taking of Shifts via Core ¶
Before Phase 5 ¶
We do not currently offer any ability for Doctors to take shifts via Core, but will begin to offer it to a limited set of users through a pre-release version of the new Shift Shopper UI.
Doctors browsing available shifts via Core’s new Shift Shopper UI will be seeing potentially stale open statuses until WhenIWork ceases to be the system of record for taken shifts. Even after Core becomes the system of record for taken shifts as WhenIWork is deprecated, it will be possible for Doctors to see open shifts on their mobile screen that have been taken by another Doctor since they loaded the screen. At some point, we will want to provide real-time updating of Doctor screens when shifts are taken, but even then we would still benefit from baking a confirmation step into taking a shift.
After Phase 5 ¶
A reusable action for “taking” a shift will abstract and cover all steps needed to ensure a complete and thorough transaction, returning either a confirmation that the Doctor has been assigned the shift, or a specific error code for why they could not take the shift. In addition to a take being rejected because another Doctor took the shift already, accommodations should also be made for takes to be rejected due to future policy constraints that the UI may not be able to reflect.
The initial version of this action should complete the following steps:
- Query WhenIWork API live to check whether shift is still open
- Reject take server-side and show a friendly message to the user if it is not still available
- Post the take to the WhenIWork API and confirm expressly in the response that the take was successful
- Use the response to update the cached wiw_shifts record immediately, attributing the resulting audit log entry to the current user
- Update Core’s shift record immediately, attributing the resulting audit log entry to the current user
By updating the WhenIWork cached record live in-band and attributing it to the acting user, we will be able to segment changes applied to WhenIWork via Core UIs and those originating from WhenIWork UIs when analyzing the audit stream.
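The steps above can be sketched as a single action that returns either a confirmation or a specific error code. This is a hedged illustration, not the WhenIWork API: the `InMemoryWiw` client, its `get_shift` and `assign_shift` methods, the convention that an open shift has `user_id` of `None`, and the error codes are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TakeError(Enum):
    ALREADY_TAKEN = "already_taken"
    POLICY_REJECTED = "policy_rejected"


@dataclass
class TakeResult:
    ok: bool
    error: Optional[TakeError] = None


class InMemoryWiw:
    """Stand-in for a WhenIWork API client; method names are hypothetical."""

    def __init__(self, shifts):
        self.shifts = shifts

    def get_shift(self, shift_id):
        return dict(self.shifts[shift_id])

    def assign_shift(self, shift_id, user_id):
        shift = self.shifts[shift_id]
        if shift["user_id"] is None:  # only an open shift can be assigned
            shift["user_id"] = user_id
        return dict(shift)


def take_shift(shift_id, doctor, wiw, wiw_cache, core_shifts, audit_log, policy_check=None):
    # 1. Query the remote system live to check whether the shift is still open.
    remote = wiw.get_shift(shift_id)
    if remote["user_id"] is not None:
        return TakeResult(False, TakeError.ALREADY_TAKEN)
    # Optional server-side policy constraints the UI may not be able to reflect.
    if policy_check is not None and not policy_check(remote, doctor):
        return TakeResult(False, TakeError.POLICY_REJECTED)
    # 2. Post the take and confirm expressly that the response names this doctor.
    response = wiw.assign_shift(shift_id, doctor["wiw_user_id"])
    if response["user_id"] != doctor["wiw_user_id"]:
        return TakeResult(False, TakeError.ALREADY_TAKEN)  # lost a race with another take
    # 3. Update the cached record and Core's record, both attributed to the acting user.
    wiw_cache[shift_id] = response
    audit_log.append(("wiw_shifts", shift_id, doctor["name"]))
    core_shifts[shift_id]["doctor_id"] = doctor["id"]
    audit_log.append(("shifts", shift_id, doctor["name"]))
    return TakeResult(True)
```

Checking the response after posting the take, and not only before it, covers the window in which another Doctor takes the shift between the live check and the post.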
Phase 2B: Auditing for All WhenIWork Entities ¶
Before Phase 2B ¶
Only WhenIWork’s shift records will be fully cached in Core and generating audit logs.
After Phase 2B ¶
A new automated job, modeled after the one developed in Phase 2A to fully scan and cache WhenIWork’s shift records, will capture all other WhenIWork entities currently captured by the existing unstructured snapshot tool in an identical fashion.
The job from Phase 2A for fully scanning and capturing all shifts in WhenIWork will likely be our longest-running interaction with WhenIWork, and should be kept in its own job. Capturing all other WhenIWork entities may happen in one consolidated job, or be split into one job per entity type. A single job for all non-shift entities may be best, as they should all be relatively short lists we can get through quickly, and running them in one serialized batch keeps them refreshed on a consistent schedule.
Timeline ¶
Features ¶
Shift Shopping ¶
Doctors can browse open shifts in a mobile-first UI and take shifts they want on their schedule.
Schedule ¶
Doctors can see all shifts they’re assigned to in a simple list, and/or add a calendar subscription feed to their personal calendar application.
Timesheets ¶
Doctors can clock in and clock out of shifts they’re assigned to, with current location captured so we can create timesheet entries compatible with those generated by WhenIWork.
This will be dependent on the Schedule feature providing actions in the context of specific shifts the Doctor has been assigned to.
Links for clocking in/out may be provided within the Doctor’s personal calendar application via the calendar subscription feed, and may also be provided via email notifications sent to the Doctor.
Risk: Low-quality WhenIWork API for timesheets ¶
The WhenIWork API documentation around managing timesheets appears to have been copied and pasted from other sections and inadequately edited. This may indicate that this section of the API is not heavily used by customers, and therefore may be poorly maintained and supported. We may find dysfunctions in practice that limit or impair our ability to manage timesheets externally while keeping the data up-to-date in WhenIWork.
First, we need to conduct basic testing of the read/create/update/delete cycle for timesheet entries. Even if that appears to work, there is a risk that as we expand operations we’ll find specific cases that the WhenIWork API does not handle well and that WhenIWork developers are unable to prioritize fixing for us.