Author Archives: Pete Carapetyan

Information Theory and the Function of Surprise

Information Theory generally postulates that the value of information can roughly be described as surprise.

In that context, SMOSLT provides value only to the extent that it offers the user information that is surprising, or that would not be visible without it.
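One common way to put a number on "surprise" is Shannon's self-information: an outcome with probability p carries -log2(p) bits, which is the same as log2(1/p). The sketch below only illustrates that formula. The class and method names are hypothetical and are not part of SMOSLT.

```java
/** Illustrative only: Shannon self-information ("surprisal") of an outcome. */
final class Surprise {

    /** Returns log2(1/p) in bits for an outcome of probability p, where 0 < p <= 1. */
    static double bits(double probability) {
        if (!(probability > 0.0 && probability <= 1.0)) {
            throw new IllegalArgumentException("probability must be in (0, 1]");
        }
        return Math.log(1.0 / probability) / Math.log(2.0);
    }

    public static void main(String[] args) {
        System.out.println(bits(1.0));   // a certain outcome carries no surprise
        System.out.println(bits(0.5));   // a fair coin flip
        System.out.println(bits(0.125)); // a 1-in-8 outcome is more surprising
    }
}
```

The rarer the outcome, the larger the value, which matches the intuition that information you could already see coming is worth little.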

Don’t Sell Yourself Short

You’d like to do more.

You’d like for your software development projects to be

better designed
more maintainable
less dependent on obsolete practices and technologies

But you’re also grounded in reality. You know that you can’t even know all the possibilities, much less follow through on them.

This is where SMOSLT can help you visualize what you would not otherwise be able to consider as options.

How To Set Up And Run SMOSLT

Caution to the timid:

As stated elsewhere on this site, this application is not for the timid. I have attempted to make it drop-dead simple scripted stuff for experienced Java programmers. Which means it probably isn’t, just that I got it working perfectly on my box.

Pre-Requirements

computer with decent power and memory – whatever that means
*nix shell, preferably a *nix box but at least having something like Cygwin if you are on a Windoze box
Latest java 7 release installed and running on box
Latest maven release installed and running on box
Latest Eclipse Luna installed and running on box, with java7 installed as the default runtime
You will need to have at least minimal java skills and eclipse familiarity
Git installed and running on box
brand new fresh eclipse workspace, such as a directory named […]/smoslt

Optional Pre-Requirements

You will probably want ProjectLibre installed to view the completed schedules
You will surely want LibreOffice, Excel, or another program that is capable of viewing Excel files.

To Install

cd to […]/smoslt or wherever your dedicated smoslt workspace is
clone smoslt.init into your workspace using this command
git clone git@bitbucket.org:datafundamentals/smoslt.init.git
cd into smoslt.init
run source init.sh or ./init.sh, or use whatever your favorite way is to run init.sh as a shell script
This will download 200+ megs into your ~/.m2/repository directory, so it may take a while
I have had to run this twice in a row to get it to run perfectly. (It will only download once, though.)
If you want the full source code for the entire app rather than just the part you want to consume, instead of init.sh above, run initfull.sh
Go to your eclipse workspace and using the import command, import some or all of these projects into your workspace.
At minimum, you will need these:
- smoslt.main
- smoslt.given
- smoslt.mprtxprt

To Run

Find the smoslt.main/[…]/smoslt.main/Main class
Run it, using right click Run as Java Application
This won’t cause anything real to happen, but at least it will set up your Run Configuration
Go to the Run Configuration and add these arguments to the run config
Run Configuration > Arguments tab > Program Arguments
-l myRunName -f […pathtoworkspace…]/smoslt.given/src/test/resources/smoslt.pod
where myRunName is any string, and the -f argument is a path to a SMOSLT compliant ProjectLibre file

Saturation vs MoreIsBetter Scoring for SideEffects

Imagine two different types of scoring scenarios

Hours – number of man-hours it takes to complete a complex software creation schedule
ScalesUp – feature of software that makes it scale to near infinite number of users

Both of these are actually Saturation based scoring models, but only theoretically. In practice, there is no hope of ever reducing the number of hours to zero, so really Hours is always a MoreIsBetter model: the more efficient the process (in this case, the fewer the hours), the better.

2863 hours: good
2714 hours: better

But ScalesUp is like most other scoring aspects of creating software. It is a goal which is actually quite attainable in a very finite sense. Once you have built software which can and does deploy to near infinite scale, you are pretty much done with that. It works, and more, in this case, is not better. You can’t improve much on perfection. This is the Saturation model, and that model can be easily represented as a 0-100 score, where 100 is the highest score.

score of 0 – 0% complete
score of 70 – 70% complete
score of 99 – 99% complete

We can conclude, then, that these two fit opposite scoring models:

Hours: MoreIsBetter model
ScalesUp: Saturation model
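The difference between the two models can be sketched in a few lines of Java. This is a minimal illustration under my own assumptions, not SMOSLT's scoring code: the class, enum, and method names are all hypothetical.

```java
/** Illustrative sketch of the two scoring models; not SMOSLT's own code. */
final class ScoringModels {

    enum Model { SATURATION, MORE_IS_BETTER }

    /** Saturation: the score is pinned to the 0-100 range, and 100 means "done". */
    static int saturationScore(int raw) {
        return Math.max(0, Math.min(100, raw));
    }

    /** Hours: there is no finish line, so schedules are only compared with each other. */
    static boolean isBetterHours(int candidateHours, int otherHours) {
        return candidateHours < otherHours; // fewer hours means a more efficient process
    }

    public static void main(String[] args) {
        System.out.println(Model.SATURATION + ": " + saturationScore(130));
        System.out.println(Model.MORE_IS_BETTER + ": " + isBetterHours(2714, 2863));
    }
}
```

A Saturation score has a ceiling that you can actually reach, while an Hours result only ever means something relative to another run.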

Scoring the Saturation Model

This is something that is not obvious until you start the process of scoring for Saturation in a specific score. It takes a different kind of thinking to score against a Saturation model when there are a number of different options that each contribute to a score.

Arbitrary Subtotals

When scoring for saturation, you know that the score can never exceed 100, but you also know that many different activities may be required to meet that goal. These activities might be spread across several options or, in the case of SMOSLT, across SideEffects treated as options.

Let’s take an oversimplified example of a ScalesUp score:

Scalable Software: software written to operate across as many machines as required
Cloud Provider: relationship with a vendor to provide as many machines as required
DevOps: deployment written to recognize and react to increased need

You can’t get to 100 without each of these aspects being complete, so really you have to be able to score each separately. The percentage of the score allocated to each is arbitrary; that they should total 100% is not arbitrary – it is a given.

Scalable Software option: 25%
Cloud Provider option: 17%
DevOps option: 58%

Only now can you begin the process of scoring. For each of the above 3 options, you have the task of separately scoring completion against its respective target.
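Because the split is arbitrary but the total is not, it may be worth checking the total before any scoring starts. The following is a minimal sketch of such a check. The class and method names are hypothetical, and the weights (25, 17, 58) are illustrative values picked here so that they total 100.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative sketch: checks that option weights for one score total 100. */
final class Apportionment {

    /** Sums the percentage weights of all options. */
    static int total(Map<String, Integer> weights) {
        int sum = 0;
        for (int weight : weights.values()) {
            sum += weight;
        }
        return sum;
    }

    /** Rejects any apportionment that does not total exactly 100. */
    static void requireTotalOf100(Map<String, Integer> weights) {
        int sum = total(weights);
        if (sum != 100) {
            throw new IllegalArgumentException("weights total " + sum + ", expected 100");
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> scalesUp = new LinkedHashMap<String, Integer>();
        scalesUp.put("Scalable Software", 25);
        scalesUp.put("Cloud Provider", 17);
        scalesUp.put("DevOps", 58);
        requireTotalOf100(scalesUp);
        System.out.println("ScalesUp weights total " + total(scalesUp));
    }
}
```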

Saturation Based Scoring Within a SideEffect

To explore what it means to score a specific option that affects a saturation score, let’s look at the simpler option of Cloud Provider above. Note first, that this option may affect several different score types, including all of these below, most of which are saturation based scoring models.

Saturation
- ScalesUp
- ScalesDown
- Durability
- ManagerSpeak
- FeedbackSpeed
MoreIsBetter
- Hours
- POLR
- LongTerm

This breakdown may or may not be representative of what really belongs in Saturation based scoring, but let’s take one that is pretty clear – ScalesUp. This is because an app either scales up properly or it doesn’t. It can easily be measured and tested, and it is pretty obvious when it fails.

So now we know that within ScalesUp, repeating from above, each of these options contributes an arbitrarily apportioned part of this score:

Scalable Software option: 25%
Cloud Provider option: 17%
DevOps option: 58%

If we look at only the CloudProvider option, or side effect, we then need to do these things

Set the apportionment of the score at 17%
As the necessary work is complete, increment score against that 17%
When this work is complete, you have fully incremented this score by no more and no less than 17
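The three steps above can be sketched as a small class that holds one option's apportioned weight and caps its contribution at that weight. This is an assumption about how such bookkeeping could look, not SMOSLT's actual SideEffect code. All names are hypothetical.

```java
/** Illustrative sketch: one option's capped contribution to a Saturation score. */
final class OptionContribution {

    private final double weight;     // share of the score, e.g. 17 for Cloud Provider
    private double fractionComplete; // 0.0 (nothing done) to 1.0 (all work done)

    OptionContribution(double weight) {
        if (weight < 0.0 || weight > 100.0) {
            throw new IllegalArgumentException("weight must be between 0 and 100");
        }
        this.weight = weight;
    }

    /** Records more completed work; progress can never pass 100% of this option. */
    void complete(double additionalFraction) {
        if (additionalFraction < 0.0) {
            throw new IllegalArgumentException("progress cannot be negative");
        }
        fractionComplete = Math.min(1.0, fractionComplete + additionalFraction);
    }

    /** Points contributed so far: never more than the apportioned weight. */
    double score() {
        return weight * fractionComplete;
    }

    public static void main(String[] args) {
        OptionContribution cloudProvider = new OptionContribution(17);
        cloudProvider.complete(0.5);
        System.out.println(cloudProvider.score()); // 8.5
        cloudProvider.complete(0.75);              // overshoots, so it is capped
        System.out.println(cloudProvider.score()); // 17.0
    }
}
```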

Another Challenge of Saturation Modeling: 100%

Saturation is based on an index of 0 to 100%. This sounds perfectly logical, and it can even be logical in implementations. Take ScalesUp for example. If you are using Cassandra for your persistence store and have the appropriate use case, at least for the persistence piece you are at 100%. Cassandra scales linearly right out of the box. Can’t get any closer to 100% than that. So if you have Cassandra in your mix, make your ScalesUp score 100%. Right?

Not so fast there, buddy. Let’s take another look.

Your project has a lot more pieces than just a persistence store. Each of these pieces can wreck a ScalesUp score. If your persistence store scales perfectly, but you have a web tier and a messaging tier too, and they don’t scale up well, then you’re not at 100% yet. So now you have to modify your scoring such that web tier and messaging tier and persistence store together add up to 100%. Is that a third each? Or does web tier get 50%, persistence store 40%, and messaging tier 10%? Good question.
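To see how much the choice of weights matters, here is a small sketch that scores the same project under both splits mentioned above. The readiness numbers for each tier are made up for illustration, and the class and method names are hypothetical.

```java
/** Illustrative sketch: combining per-tier readiness into one ScalesUp score. */
final class TierWeighting {

    /**
     * Weighted sum of tier readiness.
     * Weights are percentages that total 100; readiness values run from 0 to 100.
     */
    static double scalesUp(double[] weights, double[] readiness) {
        if (weights.length != readiness.length) {
            throw new IllegalArgumentException("one weight is needed per tier");
        }
        double score = 0.0;
        for (int i = 0; i < weights.length; i++) {
            score += weights[i] * readiness[i] / 100.0;
        }
        return score;
    }

    public static void main(String[] args) {
        // Order: web tier, persistence store, messaging tier (hypothetical readiness).
        double[] readiness = {40.0, 100.0, 40.0};
        double third = 100.0 / 3.0;
        System.out.println(scalesUp(new double[] {third, third, third}, readiness));
        System.out.println(scalesUp(new double[] {50.0, 40.0, 10.0}, readiness));
    }
}
```

With these made-up inputs the equal split comes out near 60 and the 50/40/10 split at 64, so a perfectly scaling persistence store still leaves the overall score well short of 100 either way.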

code guidelines/notes

guidelines

avoid anything that requires re-architecting later
set aside anything that can be a straightforward addition later

here is my baseline

all projects are MJWA OSGi ready jars, but not running in OSGi container
all my projects do not use carefully architected checked exceptions
- but rather just stupid RuntimeExceptions to fix later
do not do real testing, but do use junit to just get the methods running
do use Neal Ford’s composed method, but only enough to keep my job easier, not religiously

here are my options

my baseline vs other baselines

versus Marcos baseline
versus Matt baseline
versus ….

SMOSLT.main

Command Line Application for SMOSLT

Glossary

PL: ProjectLibre
[compliant]: ProjectLibre file which conforms to exact specifications expected for SMOSLT.stacker – see separate document

Features

import a [compliant] PL file and run SMOSLT.assume against same
something here about two files, one for baseline another for latest something
maybe something here about narratives or options or generating

SMOSLT.stacker [compliant] specifications

all baseline tasks
no automated options (generated by SMOSLT)
resources individually named, with exact group that belongs to
all tasks assigned with either
- specified individual by exact name
- specified group(s) by exact name, and count
- comma delimited

Somewhere need to note that

cartoon of evaluating options process

from situation come up with type of … to look up prototype/template

copy template into my…

modify template to reflect situation

add resources to match task titles

add/modify predecessor relationships

add/modify resources to reflect situation

add/modify options modifiers to reflect situation

rerun to extend out into actual schedule

primary goal is to allow you to not have to pick two

keep things fluid
not limiting visibility into options
allow you to have less than complete information

SMOSLT.options – OrGrouping

“This or this or this option, but not more than one from this group”

What Or-Grouping is NOT:

SMOSLT, as it relates to software options, offers a way to evaluate where to commit your limited resources. Should I organize the build around a Continuous Integration server? Or commit those same resources, instead, to deploying my services to smaller linux container modules?

Continuous Integration
Docker Container Service Deployments

Each of these concerns requires a commitment of resources, and I only have enough resources to do one, but they do not overlap. I could do both, if I had enough resources.

What Or-Grouping IS:

If I decide to commit resources to Continuous Integration, I’m still not done with the comparison of options. That’s where or-grouping comes in. Consider these options for Continuous Integration servers:

Jenkins
Hudson
Thoughtworks Go server
Bamboo

I need to pick whichever one of these options makes the most sense for my organization.

I would never choose more than one of these; it’s an either/or choice. Pick one.

How Or-Grouping Fails in SMOSLT.options module:

The magic of SMOSLT.options is that, unlike its human operator, it can compare every combination of options given to it.

Yet this same feature, without or-grouping, has an unintended side effect. For example, it might cause Jenkins and Bamboo to be selected for comparison at the same time! Wrong! The human would know that you either use Jenkins or Bamboo to achieve Continuous Integration, but you would never use both in combination! SMOSLT.options has no way of knowing that, without or-grouping.

How To Use Or-Grouping:

Or-grouping is implemented via naming conventions. Consider again, the same list of candidates for SMOSLT.options to compare:

Jenkins
Hudson
ThoughtworksGo
Bamboo

To implement an Or-Group, we rename this same list as follows:

Ci1-Jenkins
Ci2-Hudson
Ci3-ThoughtworksGo
Ci4-Bamboo

The SMOSLT.options module now knows never to evaluate any combination of two or more of these options at the same time. For example, it would not evaluate a combination in which more than one of the following is selected:

Ci1-Jenkins
Ci2-Hudson
Ci3-ThoughtworksGo
Ci4-Bamboo
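The effect of the naming convention can be sketched as follows. This is not the SMOSLT.options implementation. It assumes, purely for illustration, that an or-group prefix is "letters, then digits, then a hyphen" (so Ci1-Jenkins belongs to group Ci), and every class and method name is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch: enumerate option combinations while honoring or-groups. */
final class OrGroups {

    // Assumed convention: letters, digits, hyphen, name. "Ci1-Jenkins" is in group "Ci".
    private static final Pattern OR_GROUP = Pattern.compile("^([A-Za-z]+)\\d+-.+$");

    /** Returns the or-group key, or null when the name carries no or-group prefix. */
    static String groupOf(String optionName) {
        Matcher m = OR_GROUP.matcher(optionName);
        return m.matches() ? m.group(1) : null;
    }

    /** True when no two selected options share an or-group. */
    static boolean isAllowed(List<String> selection) {
        Set<String> seen = new HashSet<String>();
        for (String name : selection) {
            String group = groupOf(name);
            if (group != null && !seen.add(group)) {
                return false;
            }
        }
        return true;
    }

    /** Enumerates every subset of options, keeping only those that respect or-groups. */
    static List<List<String>> allowedCombinations(List<String> options) {
        List<List<String>> result = new ArrayList<List<String>>();
        int n = options.size();
        for (int mask = 0; mask < (1 << n); mask++) {
            List<String> selection = new ArrayList<String>();
            for (int i = 0; i < n; i++) {
                if ((mask & (1 << i)) != 0) {
                    selection.add(options.get(i));
                }
            }
            if (isAllowed(selection)) {
                result.add(selection);
            }
        }
        return result;
    }
}
```

For the four CI options alone, unrestricted enumeration would produce 16 combinations. With the or-group in place only 5 remain: no CI server at all, or exactly one of the four.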

Scoring and Inheritance with Or-Groups

These CI server options are more alike than they are different. Differences between Continuous Integration servers exist, but the big difference is not between them, but between using a CI server and not using a CI server. Here is the list again, this time with the name of the java file that does the scoring.

Ci1-Jenkins.java
Ci2-Hudson.java
Ci3-ThoughtworksGo.java
Ci4-Bamboo.java

Scoring each of these means writing each of the above java classes, and then copying and pasting the common scoring code into each, and changing whatever is unique after copying and pasting.

We all know the problems of maintaining copy-pasted code. Not good.

So instead, we refactor the above group to add a common super-class. Now the or-group class structure looks like this.

Ci0-ContinuousIntegration.java

Ci1-Jenkins.java – extends Ci0-ContinuousIntegration
Ci2-Hudson.java – extends Ci0-ContinuousIntegration
Ci3-ThoughtworksGo.java – extends Ci0-ContinuousIntegration
Ci4-Bamboo.java – extends Ci0-ContinuousIntegration

Now we can put the common scoring code in Ci0-ContinuousIntegration.java, and the other classes will only contain the scoring code that pertains to that unique server.
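In code, the refactoring might look something like the sketch below. A hyphen is not legal in a Java identifier, so the class names here drop it; how SMOSLT itself maps option names to classes is not covered in this post. The method names and the numbers are placeholders of my own and say nothing about the real merits of any CI server.

```java
/** Hypothetical common super-class: scoring shared by every CI server option. */
abstract class Ci0ContinuousIntegration {

    /** Score that any CI server contributes, regardless of vendor (placeholder). */
    protected int commonScore() {
        return 10;
    }

    /** Server-specific adjustment; subclasses override only what is unique. */
    protected int serverSpecificScore() {
        return 0;
    }

    /** Total score: the shared part plus whatever the concrete server adds. */
    final int score() {
        return commonScore() + serverSpecificScore();
    }
}

/** Overrides only the part that differs from the common scoring. */
class Ci1Jenkins extends Ci0ContinuousIntegration {
    @Override
    protected int serverSpecificScore() {
        return 2; // placeholder, not an assessment of Jenkins
    }
}

/** Has nothing unique in this sketch, so it inherits everything. */
class Ci2Hudson extends Ci0ContinuousIntegration {
}
```

The copy-pasted code disappears: a change to the common scoring is made once, in the super-class, and every server in the or-group picks it up.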

Spreadsheet Reporting

The existence of an or-grouping in a SMOSLT.options run alerts the SMOSLT.analytics module that you care about comparing various Continuous Integration options.

So the SMOSLT.analytics module prepares a separate tab in the spreadsheet document, just to compare those options. It names this tab, appropriately, “Continuous Integration”.

Or-Group Score Summarization?

As mentioned above, you have two primary issues when looking at Continuous Integration for your project.

Should I even do Continuous Integration at all, or devote resources to something else?
If I do, which of the many attractive servers should I choose to implement?

SMOSLT.analytics will prepare a spreadsheet with potentially many sheets to help you with this and other options. As stated above, it will even prepare a tab within that spreadsheet to help you with number 2 – choosing between or-group options.

The analytics piece does NOT, however, help you aggregate or summarize Continuous Integration servers for number 1. If you give it 4 or-group options, it will show each individually, making your spreadsheet potentially harder to read when deciding whether to commit resources to Continuous Integration or some other option. For that reason, you may wish to make a series of separate runs. Try this sequence, for example.

Pick Ci1-Jenkins alone, in your first runs, to compare what happens when you commit resources to Continuous Integration versus committing resources to other options such as Docker Deployments.
Once you’ve decided that Continuous Integration is probably going to be included in your project plan, then make some more runs with each of the rest of the CI or-group included. That will let you compare various CI servers and make a final decision.

SMOSLT.options – Ordering By Name Feature

Sometimes sequence of options matters.

Example:

Evaluate Continuous Integration
Evaluate …A
Evaluate …B

Might not produce the same comparison as

Evaluate …B
Evaluate …C
Evaluate Continuous Integration

To solve this problem and others – use prefixing

Aa-MyOptionOne

Ab-MyOptionTwo

This makes option MyOptionOne always evaluate before MyOptionTwo
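A minimal sketch of the idea, assuming that "ordering by name" means plain, case-sensitive String ordering (the post does not spell out the exact comparison SMOSLT uses). The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustrative sketch: derive evaluation order from option name prefixes. */
final class OptionOrdering {

    /** Returns a copy of the option names in natural (lexicographic) String order. */
    static List<String> evaluationOrder(List<String> optionNames) {
        List<String> sorted = new ArrayList<String>(optionNames);
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        List<String> options = new ArrayList<String>();
        options.add("Ab-MyOptionTwo");
        options.add("Aa-MyOptionOne");
        System.out.println(evaluationOrder(options));
    }
}
```

Whatever order the options are supplied in, the prefix alone decides the sequence, which is also why changing the sequence later means renaming.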

To really explain this you might need a lot more detail about what actually happens in a scoring module, for example the elimination of creation of future tasks.

You also need to explain how this does and does not integrate with or-group naming

You also need to explain that this is all visible in the IDE

You need to explain that they are going to have to expect to refactor names frequently so don’t fall in love with any name prefix like Bjc because it might need to be Kdd later to rework a sequence later

This also exemplifies the higher order rule that says there’s a lot of stuff we do in the IDE that other apps would do in much more refined and elegant ways. We could do them in refined and elegant ways too, but it would require additional levels of abstractions that we’re too cheap to do for free.

You also need to explain that it is permanent, and you can’t adjust it. This both solves, and causes lots of potential problems. You might sometimes wish to adjust option sequence. But to write the program to allow that would have so many consequences that it would break the KISS principle as relates to this software.

SMOSLT Development Status

SMOSLT is free software (free as in free beer) and is supported by a single person.

Development of this software might stop at any minute, should its author become otherwise occupied, such as involvement in contracts.

For the moment, it is in active development.