ARCHIVE: Java SE Automated Change Submission Process Plan
Updated 8/31/2005
We need a way to build and qualification test the JDK on all platforms,
automatically and on demand.
This document represents the plan for an automated Java SE change
submission system.
The Problem
Currently the JDK product must be built on 8 different platforms:
2 on Linux, 2 on Windows, and 4 on Solaris (SPARCV8, SPARCV9, i586, x64).
For each platform two types of builds are done, a product build and a fastdebug build.
Further, there are a multitude of ways to build the JDK, all the way from full RE control
builds, control builds, partial control builds, j2se source tree only builds,
to partial source tree or incremental builds.
Each of these platforms has specific OS release, patch, service pack, or
software installation requirements.
Every JDK release can also have slightly different OS or compiler version
requirements, further complicating things.
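To make the size of the problem concrete, here is a minimal sketch that enumerates the platform and build type combinations described above. The platform labels are illustrative assumptions (the plan gives only counts for Linux and Windows and names the four Solaris architectures), and nothing here is part of any existing tool.

```python
from itertools import product

# Hypothetical labels: only the Solaris architectures are named in the plan.
PLATFORMS = [
    "linux-1", "linux-2",
    "windows-1", "windows-2",
    "solaris-sparcv8", "solaris-sparcv9", "solaris-i586", "solaris-x64",
]
BUILD_TYPES = ["product", "fastdebug"]


def build_matrix(platforms=PLATFORMS, build_types=BUILD_TYPES):
    """Return every (platform, build type) pair that needs a build."""
    return list(product(platforms, build_types))


matrix = build_matrix()
print(len(matrix))  # 8 platforms x 2 build types = 16
```

That is 16 builds for one change, before the different ways of building and the per-release OS and compiler requirements are taken into account.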
Besides the dozens of engineers having to set up these machines for themselves
or their teams, everyone has
to be able to use them to make sure changes build prior to putback.
Technically, developers only making Java changes aren't required to build
on all the platforms, but that loophole, along with various other build
procedure loopholes everyone uses in an attempt to minimize the overhead and
meet their schedules, has created an unpredictable build status for many
integration areas.
Some teams don't have a complete array of build machines, and some have rather
antiquated and slow hardware.
So in general the overall situation, and the end result, is rather inefficient,
haphazard, and can be problematic.
The diligence and dedication of the JDK engineering teams in avoiding any
major disasters in the face of this is, in my view, quite amazing.
In general, solving this problem is probably the most effective way to improve
the existing JDK developer productivity level.
Here is at least a partial list of the specific reasons why implementing
this plan makes sense:
-
Minimize the need for individual teams to have specific and dedicated build hardware,
essentially allowing all the teams to share these resources.
-
Reduce the system administration of the build machines by reducing the number
of them and putting their support in the hands of the right people and out
of the hands of individual engineers.
A considerable amount of engineering time is spent keeping systems up-to-date,
and although we can't completely remove this burden, this will lighten it.
-
Improve the reliability of the builds by using more consistent hardware and
configurations between teams.
Just knowing that everyone has used the same machines
and same system to verify their putback saves lots of investigation when
things go wrong.
Also, preventing Release Engineering build failures, although they are rare,
is considered a big plus.
-
Improve the quality of the builds by providing an easy way to verify
successful builds on all platforms prior to putbacks.
-
Ultimately reduce our total need for build hardware.
-
By tracking and recording the builds, it becomes possible to isolate regressions
in functionality or performance to specific putbacks, reducing the overall time
needed to get a regression fix in place.
-
Enable the development engineers to spend more time focused on their specific development
areas rather than having to track down build machines, or worry if something will
build on all platforms.
By guaranteeing that your changes won't cause a Release Engineering build failure,
it allows you to focus more on the specific functionality changes you are making.
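As an illustration of the regression isolation item in the list above, here is a minimal sketch of a binary search over a recorded build history. It assumes the history is ordered oldest to newest, that the oldest build is good and the newest is bad, and that a regression stays present once it appears. The function name and the `is_good` callback are hypothetical and not part of any existing tool.

```python
def first_bad_putback(history, is_good):
    """Find the first putback whose saved build shows the regression.

    history: putback identifiers, ordered oldest to newest.
    is_good: callable that tests the saved build for one putback.
    Assumes history[0] is good and history[-1] is bad.
    """
    if len(history) < 2:
        raise ValueError("need at least one good and one bad build")
    lo, hi = 0, len(history) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(history[mid]):
            lo = mid
        else:
            hi = mid
    return history[hi]


# Example with made-up putback names; the regression arrives with "p6".
history = ["p1", "p2", "p3", "p4", "p5", "p6", "p7", "p8"]
print(first_bad_putback(history, lambda p: int(p[1:]) < 6))  # p6
```

With saved bits for every putback, the search needs only about log2(N) test runs rather than one per putback.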
Some Background
The Hotspot team has had imgr and PRT as source tree integration tools for quite some
time, and if you ask any of these engineers what they would do if you took it away
they would probably look at you in complete shock and assume you were nuts.
Nobody is saying PRT is perfect, but it is for the most part a functioning system
that is providing productivity benefits to their entire team.
Other benefits include a history of built bits (saved by PRT), the ability of the testing
and quality teams to access the latest build of any given team,
the ability to investigate regressions (performance included) down to individual team member putbacks,
the ability to test build or dry run your changes without a putback,
and a consistent set of putback comments for integration into higher level source trees.
PRT hasn't solved all the issues, but in my view it has solved the build issue
very well, and could potentially solve more.
Getting more use of PRT will likely get us on the road to making it better.
Choices
I'm obviously biased toward PRT, but I think it's justified.
I won't completely rule out a different mechanism, but I have a hard time
thinking that doing anything but extending PRT makes any sense at all.
It IS my intention that whatever is done to PRT to make it do JDK builds/tests becomes
part of the PRT sources. I do not want to copy PRT and create a PRT-clone; we need to
evolve the existing PRT, not create a new and different beast.
Plan for a JDK capable PRT system
PRT allows for separate instances. We have a PRT east and a PRT west: two
different labs, two sets of machines. The
current thinking is that we would start with an experimental PRT-JDK west
instance with its own set of machines.
The initial target is j2se source tree putbacks, with minimal testing qualification.
As the project progresses we can determine at a later date if it makes sense to
have a single PRT west system and/or what they would want to do with PRT east.
If done right, we can allow for any PRT system to build hotspot or JDK, but
the logistics and needs of the VM teams may be a basis for keeping these
systems separate; that remains to be seen. At a minimum the various PRT systems
can serve as backups to each other, just as PRT-east and PRT-west do now.
Eventually all changes to the PRT source base should be putback to the master
PRT source base. Depending on the level of PRT changes for Hotspot, this sync up
may take some care, but it must be done eventually, and we do not want to break the existing running PRT systems.
PRT was initially designed to allow for a JDK build, but that was never implemented.
I've been told that experiments with the PRT source base could be done with
one machine doing everything, and limiting yourself to one platform.
That's probably how the PRT source changes will be started.
Other concerns about using PRT:
-
Downtime, hung jobs, and the PRT babysitting problems. Problems that can be fixed?
This will take time. The general feeling is that this isn't PRT itself as much as the
reliability of the platforms and hardware.
-
JDK demands on disk space storage, 3X what hotspot needs? Or more? (just saving images)
We will need to make sure we have plenty of disk space and that we only save what is
absolutely necessary.
-
Build machines may need additional components installed: SDKs, DirectX, unicows, etc.
This should be a minor issue, and a one-time machine setup.
-
JDK testing issues need to be resolved. Traditionally the PRT testing for hotspot
has been fairly minimal and limited, with good reasons (limiting impact on throughput).
It's not clear that just adding hardware solves this issue.
The j2se tests run will need to be reliable tests, and we may need to be selective.
Tests that are known to occasionally fail, for whatever reason, will need to be
screened out.
Further testing expansion will involve the existing hotspot usage too, so this
issue will be shared with all PRT uses.
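One way the screening of unreliable tests mentioned in the list above might work is sketched below. It assumes a record of past pass/fail results for each test on unchanged sources; the data layout, the `min_runs` threshold, and the test names are all hypothetical.

```python
def reliable_tests(results, min_runs=5):
    """Return the tests that passed every recorded run.

    results: maps a test name to a list of booleans (True = passed),
             gathered from runs against sources known to be good.
    min_runs: tests with fewer recorded runs are not yet trusted.
    """
    return sorted(
        name
        for name, runs in results.items()
        if len(runs) >= min_runs and all(runs)
    )


# Made-up test names and results, for illustration only.
past_results = {
    "TestA": [True, True, True, True, True],
    "TestB": [True, False, True, True, True],  # fails occasionally
    "TestC": [True, True],                     # too little history
}
print(reliable_tests(past_results))  # ['TestA']
```

A single failure on known good sources is enough to screen a test out here, which is deliberately strict: a test that fails occasionally would block putbacks that did nothing wrong.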
What needs to be done?
-
This plan written up. DONE
-
Develop initial hardware/machine needs (see PRT/JDK Resource Needs). DONE
-
Setup, once the hardware has arrived. IN PROGRESS
Initial hardware needs to be installed with the required build
OS/patches/compilers/tools etc.
(We should try to automate whatever part of this we can.)
-
Start PRT source changes to allow for a j2se source tree build. IN PROGRESS
This can be done in parallel to the above activity, and be done on separate
hardware dug out of offices or the lab.
-
Once operational, at a minimum, request all MASTER j2se integrations use it?
Future Considerations
Items to consider once we have an operational system:
-
Which putbacks will be required to use PRT-JDK? Master integrations only at first?
As many as it can take, we will find the load limit on the system.
-
How do we determine which tests to run? We can't run them all.
Based on the files putback? Or some prt_submit option?
-
Allow for full control builds, deploy builds, and deal with putbacks to multiple
source trees?
-
Further hardware expansion for build or testing machines may be necessary
but will be hard to gauge until we have a build system up and running.
Build machines need to be specific releases and be configured in specific
ways, but test machines would be of various types
(e.g. Windows 98/XP/2003, Linux 31 flavors, Solaris 8/9/10/11, etc.).
PRT does have a concept of a build and a test machine, but we need to
work on our parallelism during builds and tests.
-
Complete or subset builds on all or selected build platforms.
-
Optional additional tests on selected platforms.
-
Possible automated bugtraq update mechanism? Using 'Fixed NNNNNNN:' patterns from putback comments?
-
Easy command line usage with 'makes sense' defaults.
-
System may need to have blackout periods so that primary integrations can use the system as needed.
-
Builds and configurations should match the Release Engineering team machines.
-
The build/test machines should not be easily accessible for accidental sharing or
disturbance. (separate sub-network?)
-
Ability to use it with any source tree, and with any child->parent putback/commit operation.
Not just integration or MASTER source trees.
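Two of the items in the list above lend themselves to a small sketch: choosing tests based on the files putback, and pulling bug IDs out of 'Fixed NNNNNNN:' putback comments. Everything below is an assumption made for illustration: the 7-digit ID length is read literally from the placeholder, and the file paths, rule table, and test group names are made up. None of it reflects an existing PRT or bugtraq interface.

```python
import re
from fnmatch import fnmatchcase

# Assumes bug IDs are exactly 7 digits, as the NNNNNNN placeholder suggests.
FIXED_PATTERN = re.compile(r"\bFixed (\d{7}):")


def fixed_bug_ids(comment):
    """Return the bug IDs named by 'Fixed NNNNNNN:' lines in a comment."""
    return FIXED_PATTERN.findall(comment)


def select_tests(changed_files, rules):
    """Pick test groups whose file pattern matches any changed file.

    rules: list of (glob pattern, test group name) pairs, checked in order.
    """
    selected = []
    for pattern, group in rules:
        if group in selected:
            continue
        if any(fnmatchcase(path, pattern) for path in changed_files):
            selected.append(group)
    return selected


# Hypothetical rule table and putback, for illustration only.
rules = [
    ("src/share/classes/java/util/*", "util-tests"),
    ("src/share/classes/java/awt/*", "awt-tests"),
    ("make/*", "build-sanity"),
]
changed = ["src/share/classes/java/util/HashMap.java", "make/Defs.gmk"]
print(select_tests(changed, rules))  # ['util-tests', 'build-sanity']
print(fixed_bug_ids("Fixed 1234567: made-up summary line"))  # ['1234567']
```

A prt_submit option could override the file based selection when the submitter knows better.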