[Arm-dev] Getting started / Build machines

Thu Jul 3 13:49:41 UTC 2014
Gordan Bobic <gordan at redsleeve.org>

On 2014-07-03 14:41, D.S. Ljungmark wrote:
> On 03/07/14 15:33, Gordan Bobic wrote:
>> On 2014-07-03 14:25, D.S. Ljungmark wrote:
>>> On 03/07/14 12:52, Gordan Bobic wrote:
>>>> On 2014-07-03 11:42, Johnny Hughes wrote:
>>>>> On 07/03/2014 05:19 AM, D.S. Ljungmark wrote:
>>>>>> Excellent information, I'd love the scripts, and post-weekend
>>>>>> sounds as if it'd fit well with my schedule.
>>>>>> 
>>>>>> I'll see about taking the time to document the steps as well, so
>>>>>> we might get a wiki started on how to do this; it seems as if there
>>>>>> are a few people who have an interest, and at least documenting the
>>>>>> basics might be good.
>>>>>> 
>>>>>> Regards,
>>>>>>   D.S.
>>>>>> 
>>>>>> 
>>>>>> On 03/07/14 12:14, Gordan Bobic wrote:
>>>>>>> On 2014-07-03 11:00, D.S. Ljungmark wrote:
>>>>>>>> Thanks for the heads-up on that.
>>>>>>>> 
>>>>>>>> So, plan of action would be:
>>>>>>>>  * Find / prepare a F19 bootable image.
>>>>>>> Technically, as Karanbir said, you don't have to run F19
>>>>>>> on the build host, just use the F19 repository for mock
>>>>>>> builds. OTOH, for first pass you may find it a lot faster
>>>>>>> to install F19 (install _all_ packages), and instead of
>>>>>>> mock, use just straight rpm to build the first pass.
>>>>>>> 
>>>>>>> This will save you a tonne of time because the chroot won't
>>>>>>> have to be built every time (it takes time even if it's
>>>>>>> tarred and cached rather than yum installed each time).
>>>>>>> 
>>>>>>> Expect spurious failures if you do that - in EL6 I noticed
>>>>>>> there are packages that fail to build if other packages
>>>>>>> that aren't in the dependency list are installed. This
>>>>>>> is because the package's configure finds the extra
>>>>>>> packages and tries to build against them, which fails
>>>>>>> (or worse, produces a broken binary). If you remove the
>>>>>>> extra package, the build will succeed.
>>>>>>> 
>>>>>>> But for the first pass it should be OK because you
>>>>>>> are only going to use what comes out of it to build
>>>>>>> the second pass.
>>>>>>> 
>>>>>>> Then you rebuild it all again, just to make sure,
>>>>>>> and you should be good for an alpha test, and start
>>>>>>> working on genuine build failures, erroneous arch
>>>>>>> restrictions, etc. It is this stage that takes
>>>>>>> hundreds of man-hours. Everything else is mostly CPU
>>>>>>> time.
>>>>>>> 
>>>>>>> For building with multiple machines, I use a simple
>>>>>>> script on all the builders that places a lock file
>>>>>>> on uncached NFS when a package is picked for build,
>>>>>>> and if a builder sees there's a lock file there,
>>>>>>> it goes on to the next package in the list. It's
>>>>>>> trivially simple and works very well. It would be
>>>>>>> nice to have something that resolves all dependencies
>>>>>>> for building and tries to build the packages in the
>>>>>>> dependency tree order, but that's mostly useful for
>>>>>>> bootstrapping from scratch, and we are cheating by
>>>>>>> bootstrapping on F19, so it isn't as big a problem.
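>>>>>>>
>>>>>>> (For illustration, a minimal sketch of that kind of lock script,
>>>>>>> with the shared paths made up, is something like this; it uses
>>>>>>> mkdir as the lock here, since mkdir is atomic over NFS:)
>>>>>>>
>>>>>>> #!/bin/sh
>>>>>>> # Shared, uncached NFS locations (example paths only)
>>>>>>> LOCKDIR=/mnt/nfs/locks      # one lock per claimed package
>>>>>>> SRPMS=/mnt/nfs/srpms        # source packages to rebuild
>>>>>>> LOGDIR=/mnt/nfs/logs        # build logs and the failure list
>>>>>>>
>>>>>>> for srpm in "$SRPMS"/*.src.rpm; do
>>>>>>>     name=$(basename "$srpm" .src.rpm)
>>>>>>>     # If another builder has already claimed this package, the
>>>>>>>     # mkdir fails and we just move on to the next one in the list.
>>>>>>>     mkdir "$LOCKDIR/$name" 2>/dev/null || continue
>>>>>>>     rpmbuild --rebuild "$srpm" > "$LOGDIR/$name.log" 2>&1 \
>>>>>>>         || echo "$name" >> "$LOGDIR/failed.txt"
>>>>>>> done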
>>>>>>> 
>>>>>>>>  * Install mock (git or are the packages ok?)
>>>>>>> See above - you can save a lot of time for the first
>>>>>>> build pass by not using mock. Install all Fedora
>>>>>>> packages, and then simply use:
>>>>>>> 
>>>>>>> rpmbuild --rebuild $package.rpm
>>>>>>> 
>>>>>>>>  * build a mock F19 starter, test compile something traditional
>>>>>>>> (bash?)
>>>>>>>>  * Duplicate this environment to the various machines
>>>>>>>>  * set up nfs for compile target
>>>>>>>>  * wrap some scripts around pssh to do parallel builds
>>>>>>>> 
>>>>>>>> -- Am I missing something major here?
>>>>>>> That's pretty much it. I am happy to share the scripts
>>>>>>> I use. If I don't post them by the weekend ping me
>>>>>>> to remind me. I can't get to them right now because my
>>>>>>> build farm is behind a firewall I don't have a hole in.
>>>>> 
>>>>> I would be happy to set up mock for you if you want ... or even
>>>>> just put what I have been using for mock configs here on this list
>>>>> for a test build.
>>>>> 
>>>>> I personally would rather produce the RPMs via mock, as that
>>>>> prevents pulling in spurious links for packages because the
>>>>> buildroot is too fat (i.e. in RHEL, package-y.x.z does not link
>>>>> against package-a.b.c because it is not a BuildRequires and is not
>>>>> installed in the mock build root, but if built with package-a.b.c
>>>>> in the buildroot, the configure process checks for and links
>>>>> against it).
>>>> 
>>>> Indeed, I touched upon that in my previous post, but that doesn't
>>>> really matter too much for the first build pass, as you are going
>>>> to rebuild everything again anyway. And you can pick off the few
>>>> build failures arising from too much junk in the build environment
>>>> before the second pass via mock rebuilds.
>>>> 
>>>>> All mock does is build a minimum clean build root for each package,
>>>>> where only the specific requires for building are in the build root,
>>>>> so that each package gets only what it needs to build and builds are
>>>>> more consistent.
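>>>>>
>>>>> (For reference, a rebuild through mock is a one-liner; the config
>>>>> name here is just a placeholder for whatever you end up using:)
>>>>>
>>>>> # mock creates a fresh chroot containing only the BuildRequires,
>>>>> # builds, and drops the RPMs and logs into the result directory
>>>>> mock -r fedora-19-arm --rebuild package-1.0-1.fc19.src.rpm \
>>>>>      --resultdir=/srv/results/package-1.0-1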
>>>> 
>>>> Sure, but the time it takes to rm -rf the build root and then untar
>>>> a cached build root copy is a non-trivial fraction of the build
>>>> time for a lot of the packages. It is certainly not trivial when
>>>> you multiply it by around 2,000 for the number of packages you are
>>>> going to need to build.
>>>> 
>>>> From experience, anything you can do to get past the first-stage
>>>> build faster is usually a good idea if hardware is limited - and
>>>> on ARM it usually is. Even on something like the Arndale Octa or
>>>> the new Chromebook, which have 3-4GB of RAM and 8 cores, building
>>>> takes a while. It's not like throwing a full distro rebuild at a
>>>> 24-thread Xeon monster with 96GB of RAM and being able to expect
>>>> it to have completed by tomorrow morning.
>>>> 
>>>> 
>>> 
>>> Oh, trust me on that one, I know how this is, and the boards we have
>>> aren't monsters (but we have a fair number of them ;)
>>> 
>>> Ex-Gentoo dev here, I know -everything- about obnoxious build times
>>> and how amazingly painful it can be. I suspect this will be a lot
>>> like going back to an old Thunderbird CPU with PATA drives.
>> 
>> It's not _too_ bad. The biggest package in the distro is LibreOffice,
>> and that takes about 24 hours to build on an Exynos A15 Chromebook.
>> It would build in half that time, but the build process goes out of
>> its way to only fork one thread for some reason.
>> 
>>> Spreading it out across many different machines might help. Not sure
>>> if there's a way to hook btrfs or LVM snapshots into the build/restore
>>> process (I don't understand it enough yet to test), but those
>>> certainly won't work over network filesystems.
>> 
>> Indeed, and not working over NFS means you can only use builders
>> with local SATA or at a push USB->SATA disks. Having said that,
>> there are packages that fail to build on NFS due to test failures.
>> 
>> 4GB of RAM is the biggest limitation. What I tend to do is attach
>> a decent, large SSD (LibreOffice needs nearly 40GB to build!!!)
>> to each builder, set up tons of swap (say, 8GB per core), and run
>> as many build threads as there are cores in the machine. That
>> tends to yield optimal hardware saturation even if some
>> packages insist on building single-threaded.
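>>
>> (In practice that just means starting one copy of the claim-and-build
>> loop per core - "claim-and-build.sh" below is a made-up name for the
>> lock-file loop sketched earlier:)
>>
>> # one builder loop per core; each loop claims its own packages
>> # via the NFS lock directory, so they never collide
>> for i in $(seq 1 "$(nproc)"); do
>>     ./claim-and-build.sh &
>> done
>> wait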
> 
> 
> That might be an issue to check. I'll probably end up swapping anyhow,
> as the boards aren't gloriously drowning in RAM.
> 
> Ach well, more stuff to look at and keep in mind - good knowledge to
> have, though!
> (Next up: a performance test of the boards that swap over NFS vs. those
> that swap over USB->SATA SSD. Based on previous numbers, I guess swap
> over NFS will actually be faster. That's for Monday though, as there are
> so many other things.)

Does swapping to a file on NFS even work reliably any more? I don't
think it does, unless something changed recently. It certainly didn't
with the kernels I used (which are, granted, quite old), and neither
did swapping on any kind of network-attached block device: parts of
the networking stack can end up getting swapped out, and then the
machine locks up because it needs the networking stack to reach the
block device in order to unswap them.

Either way, you are likely to be sufficiently CPU constrained that
swapping to USB->SATA->SSD should be good enough.
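
For reference, setting up a big swap file on a locally attached SSD is
only a few commands; the size and path below are just an example (8GB
per core on an 8-core board):

# 64GB swap file on an SSD mounted at /mnt/ssd (example path)
dd if=/dev/zero of=/mnt/ssd/swapfile bs=1M count=65536
chmod 600 /mnt/ssd/swapfile
mkswap /mnt/ssd/swapfile
swapon /mnt/ssd/swapfile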