How Would You Design a Vacuum Cleaner?

Note: this post is about object-oriented design, not vacuum cleaners or AI.

I’ve recently got the AIMA book, which is probably the most fundamental textbook on AI. Today I set out to complete the exercises for Chapter 2, “Intelligent Agents.”

Most of the exercises are about designing an intelligent vacuum cleaner for various environments and performance metrics. They seemed trivial at first, but they turned out to strain my software architecture skills.

The task is to program an agent (a vacuum cleaner, VC for short) and an environment, defined as follows. The environment consists of only two locations, A and B. There may be garbage in either location. The VC can perform four actions:

  • Left (go from B to A if at B, else no effect),
  • Right (go from A to B if at A, else no effect),
  • Suck (clean the location if dirty),
  • NoOp (no action).

The VC should be implemented as a function that is called by, say, Environment and returns the action to be performed next. The VC’s sensors tell it where it is and whether there is any garbage at that location. Actions, by the way, are performed by the VC’s actuators.

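The first exercise is to implement a VC agent that acts according to the following logic:

    public Action act(Sensor sensor) {
        if (sensor.isDirty()) {
            return Action.Suck;
        } else if (sensor.getPosition() == Environment.POS_LEFT) {
            // At the left location (A): move to B.
            return Action.Right;
        } else if (sensor.getPosition() == Environment.POS_RIGHT) {
            // At the right location (B): move to A.
            return Action.Left;
        }
        // Unknown position: do nothing (Java also requires a return on every path).
        return Action.NoOp;
    }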

Seems trivial, doesn’t it?

In fact, it isn’t, provided that you must implement the thing in a fully modular way, so that you can change the agent function, the performance metric, the actuators or the sensors (say, to model a broken sensor) independently.

Here is how I solved this problem.

First of all, there are VacuumCleaner, Environment and PerfMeter objects. Environment is responsible for running the VC and for letting PerfMeter collect its data:

    // A method of the Environment class.
    public void run(
            int forTime,
            VacuumCleaner agent,
            Actuator actuator,
            Sensor sensor,
            PerfMeter meter) {

        meter.reset(this);
        for (int i = 0; i < forTime; i++) {
            sensor.sense(this);
            Action action = agent.act(sensor);
            actuator.performAction(action, this);

            meter.examineEnvironment(i, this);
            examineEachPosition(meter);
        }

    }

Therefore PerfMeter acts like a Visitor from Design Patterns, and I think I can refer to the agent as a ‘stateful strategy’… (Haven’t I said that a strategy must be stateless, so that it can be called from several places simultaneously?)
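To make the Visitor analogy more concrete, here is a minimal sketch of one possible meter. The interface shape and the names PerfMeterSketch, examinePosition, score and CleanSquareMeter are assumptions made for illustration, not the code from the project; the metric simply awards one point per clean location per time step.

```java
import java.util.Arrays;
import java.util.List;

class PerfMeterSketch {

    /** Hypothetical visitor-style meter: the environment shows it every location at every step. */
    interface PerfMeter {
        void reset();
        void examinePosition(boolean dirty);
        int score();
    }

    /** Awards one point for each clean location seen at each time step. */
    static class CleanSquareMeter implements PerfMeter {
        private int points;

        @Override public void reset() { points = 0; }

        @Override public void examinePosition(boolean dirty) {
            if (!dirty) {
                points++;
            }
        }

        @Override public int score() { return points; }
    }

    /** Plays the environment's role: walks over all locations and lets the meter visit each one. */
    static int measure(PerfMeter meter, List<boolean[]> dirtPerStep) {
        meter.reset();
        for (boolean[] locations : dirtPerStep) {
            for (boolean dirty : locations) {
                meter.examinePosition(dirty);
            }
        }
        return meter.score();
    }

    public static void main(String[] args) {
        // Step 0: A is dirty, B is clean. Step 1: both are clean.
        List<boolean[]> history = Arrays.asList(
                new boolean[] {true, false},
                new boolean[] {false, false});
        System.out.println(measure(new CleanSquareMeter(), history)); // prints 3
    }
}
```

Swapping the metric then means passing a different PerfMeter implementation; the loop that visits the locations stays the same.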

Sensor is an intermediary object that takes data from Environment and passes it to the agent, possibly distorting that data to simulate a broken mechanism.

Actuator is another intermediary object, which carries out the agent’s will. Therefore it can prevent our VC from doing something, just as Sensor can prevent it from knowing something.
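The two intermediaries can be sketched roughly as follows. This is an illustration under assumptions, not the original classes: TwoCellWorld, ExactSensor, BlindDirtSensor and JammedActuator are hypothetical names, and the world is reduced to two boolean flags and an agent position.

```java
class BrokenPartsSketch {

    enum Action { Left, Right, Suck, NoOp }

    /** Hypothetical minimal world: index 0 is location A, index 1 is location B. */
    static class TwoCellWorld {
        final boolean[] dirty = {true, true};
        int agentPosition = 0;
    }

    interface Sensor {
        void sense(TwoCellWorld world);
        boolean isDirty();
        int getPosition();
    }

    /** Reports the world faithfully. */
    static class ExactSensor implements Sensor {
        private boolean dirty;
        private int position;

        @Override public void sense(TwoCellWorld world) {
            position = world.agentPosition;
            dirty = world.dirty[position];
        }
        @Override public boolean isDirty() { return dirty; }
        @Override public int getPosition() { return position; }
    }

    /** Simulates a broken dirt detector: the position is correct, dirt is never reported. */
    static class BlindDirtSensor extends ExactSensor {
        @Override public boolean isDirty() { return false; }
    }

    /** Applies the agent's actions to the world. */
    static class Actuator {
        void performAction(Action action, TwoCellWorld world) {
            switch (action) {
                case Left:  world.agentPosition = 0; break;
                case Right: world.agentPosition = 1; break;
                case Suck:  world.dirty[world.agentPosition] = false; break;
                default:    break; // NoOp
            }
        }
    }

    /** Simulates a jammed suction motor: movement works, Suck does nothing. */
    static class JammedActuator extends Actuator {
        @Override void performAction(Action action, TwoCellWorld world) {
            if (action != Action.Suck) {
                super.performAction(action, world);
            }
        }
    }

    public static void main(String[] args) {
        TwoCellWorld world = new TwoCellWorld();
        Sensor sensor = new BlindDirtSensor();
        sensor.sense(world);
        System.out.println(sensor.isDirty());   // false, although A is dirty
        new JammedActuator().performAction(Action.Suck, world);
        System.out.println(world.dirty[0]);     // true: the jammed actuator ignored Suck
    }
}
```

The agent function never changes here: a fault is modeled purely by handing the environment a different Sensor or Actuator.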

This approach worked rather well for the environment with two locations, but what am I to do if I also want an environment representing a two-dimensional surface? I obviously want to subclass Environment, because I have to store location data differently. And how do I refer to positions? I used an int to denote a 1D position, but now I want to pass two numbers: the x and y coordinates of the VC.

The solution comes from the book Refactoring by Martin Fowler. We can apply ‘Replace Data Value with Object’ and index our location maps with a Position object, and this won’t break the old code if we introduce two constant positions, Left and Right (or A and B).
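A minimal sketch of what such a Position object might look like, assuming a simple immutable (x, y) pair. DirtMap and the choice of coordinates (0, 0) and (1, 0) for the two constants are hypothetical and only serve the illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class PositionSketch {

    /** Immutable value object that replaces the bare int position. */
    static final class Position {
        /** The two locations of the original one-dimensional world. */
        static final Position LEFT = new Position(0, 0);
        static final Position RIGHT = new Position(1, 0);

        final int x;
        final int y;

        Position(int x, int y) {
            this.x = x;
            this.y = y;
        }

        // equals and hashCode are required for Position to work as a map key.
        @Override public boolean equals(Object other) {
            if (!(other instanceof Position)) {
                return false;
            }
            Position p = (Position) other;
            return x == p.x && y == p.y;
        }

        @Override public int hashCode() { return Objects.hash(x, y); }

        @Override public String toString() { return "(" + x + ", " + y + ")"; }
    }

    /** Hypothetical dirt map indexed by Position, usable by 1D and 2D environments alike. */
    static class DirtMap {
        private final Map<Position, Boolean> dirty = new HashMap<>();

        void setDirty(Position p, boolean value) { dirty.put(p, value); }

        boolean isDirty(Position p) { return dirty.getOrDefault(p, false); }
    }

    public static void main(String[] args) {
        DirtMap map = new DirtMap();
        map.setDirty(Position.LEFT, true);       // the old two-location code keeps working
        map.setDirty(new Position(3, 4), true);  // a 2D environment uses arbitrary coordinates
        System.out.println(map.isDirty(new Position(0, 0))); // true: equal by value to LEFT
        System.out.println(map.isDirty(Position.RIGHT));     // false
    }
}
```

Because positions compare by value, a 2D subclass of the environment can create positions freely while the 1D code keeps using the two constants.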

The code works, but it still ‘smells’: subclasses often override too many methods of their ancestors, and it would be better to extract some interfaces… But it just doesn’t matter.


Why is Cliche worth looking at?

This was actually written as part of the as yet unpublished Cliche Manual. It will be available at cliche.sourceforge.net.

Cliche is a small (under 100 kilobytes, no dependencies) Java library that makes creating CLIs really simple (CLI stands for “command-line user interface” throughout the Cliche docs).

How simple? So simple:

package asg.cliche.sample;

import asg.cliche.Command;
import asg.cliche.ShellFactory;
import java.io.IOException;

public class HelloWorld {

    @Command // One,
    public String hello() {
        return "Hello, World!";
    }

    @Command // two,
    public int add(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) throws IOException {
        ShellFactory.createConsoleShell("hello", "", new HelloWorld())
            .commandLoop(); // and three.
    }
}

Three additional lines of code! More precisely, one for each method to be made available to the user, and one “magic phrase” to run the Shell. Well, that should produce a really crappy UI…

hello> ?list
Name    Abbr    ArgNum  Descr
hello   h       0       hello() : String
add     a       2       add(p1:int, p2:int) : int
hello> hello
Hello, World!
hello> add 45 34
79
hello> a 45 34
79
hello> exit

Not so bad, I suppose. As you may have noticed, there are some built-in commands, such as ?list; they are prefixed with a special symbol, e.g. ? or !.

This library has proved very helpful for experimenting with new code. Say you’ve spent an hour writing a new algorithm and want to test it quickly. Instead of spending another hour writing some kind of UI, you just add a couple of annotations and that magic line, hit Shift+F6, and quickly learn how the code behaves. Or you’ve got a third-party library and want to test it “hands on”.

Personally, I am very happy to have Cliche, because I’m now experimenting with algorithms a lot and I spend no time constructing the UI.

There may be “real” applications too, since its core functionality, i.e. type conversion and object output, is easily extensible. It also supports subshells, which give us the important ability to navigate through tree-like structures.

One more use case is when you have a rather complicated system, say, an MVC-style webapp, and you want to give the sysadmin full control over the model, but it’s too expensive to build a web admin panel for just one person… Then you’ll probably only have to write a simple Main class and, in the worst case, some type converters.

In many cases we prefer simplicity, and I don’t think there’s a simpler UI framework for Java.

Isn’t this small thing interesting?


Web action methods

I still think that JavaServer Whatever is overcomplicated, and now I want to share my thoughts about JSF action methods.

I’ve already said that they should have parameters, and now I want to say that they should have a return value. At present they return a string that tells the JSF engine which page to display next (i.e. the View ID). But in most cases that’s not the result of performing an action! Of course, when you click “home”, the page transition is the action, but when you submit a comment, for example… Well, then you probably don’t need your View ID to change, so the result is no transition. Bad example.

Search action is probably the best example. You take a string and return a table of results. Therefore your action method can be written as follows:

SearchResults search(String request) { . . . }

And you probably have exactly this method in your model, but in your view you need a page with a table layout, a table controller, whatever.

Wouldn’t it be good to return a big object to our framework and give it a converter that transforms the object into something displayable? No duplicated pages for similar objects, and the view can be customized by changing the converter or deriving from it.

I’ll call it a “thin view”: view methods are very close to the corresponding model methods. I think this approach can be implemented quite well, and there are some areas where it would be especially useful, e.g. monitoring applications, where you ask the machine to visualize various usage statistics: by implementing new converters you could easily change the representation.
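The idea can be sketched in a few lines of plain Java. Nothing here is a JSF API: ThinView, register and render are hypothetical names for an imagined framework, the search method returns canned data, and HTML escaping is left out for brevity.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class ThinViewSketch {

    /** Hypothetical result object returned by the action method. */
    static class SearchResults {
        final List<String> hits;

        SearchResults(List<String> hits) { this.hits = hits; }
    }

    /** Hypothetical "thin view": picks a converter by the result's class and renders it. */
    static class ThinView {
        private final Map<Class<?>, Function<Object, String>> converters = new HashMap<>();

        <T> void register(Class<T> type, Function<T, String> converter) {
            converters.put(type, value -> converter.apply(type.cast(value)));
        }

        String render(Object result) {
            Function<Object, String> converter = converters.get(result.getClass());
            if (converter == null) {
                return String.valueOf(result); // fallback: plain text
            }
            return converter.apply(result);
        }
    }

    /** The action method: the same signature the model method would have. */
    static SearchResults search(String request) {
        // Stand-in for a real search: returns canned data.
        return new SearchResults(Arrays.asList(request + " one", request + " two"));
    }

    /** One possible converter: renders the hits as an HTML list (no escaping, for brevity). */
    static String asHtmlList(SearchResults results) {
        StringBuilder html = new StringBuilder("<ul>");
        for (String hit : results.hits) {
            html.append("<li>").append(hit).append("</li>");
        }
        return html.append("</ul>").toString();
    }

    public static void main(String[] args) {
        ThinView view = new ThinView();
        view.register(SearchResults.class, ThinViewSketch::asHtmlList);
        System.out.println(view.render(search("cat")));
        // prints <ul><li>cat one</li><li>cat two</li></ul>
    }
}
```

Changing the representation then means registering a different converter for SearchResults; the action method itself is untouched.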

There’s PHP

We actually want to build on more layers of abstraction to get to the lower layer! A friend of mine, working mostly with PHP, helped me understand this issue.

What do we have there? An HTTP request, which looks exactly like a function call (host.com/add.php?a=1&b=2 <=> com.Host.add(1, 2)), and it is possible to structure the code so that you have a method returning, say, a rather complicated dictionary-like structure and a rather simple function that transforms this structure into HTML. I don’t think full separation of code and, uhm, other (HTML) code is always necessary.
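Here is a rough sketch of that structure. It is written in Java rather than PHP only to stay in one language with the other examples, and all the names (RequestAsCall, parseQuery, toHtml, handleAdd) are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RequestAsCall {

    /** Parses "a=1&b=2" into an ordered name-to-value map (no URL decoding, for brevity). */
    static Map<String, String> parseQuery(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String pair : query.split("&")) {
            String[] kv = pair.split("=", 2);
            params.put(kv[0], kv.length > 1 ? kv[1] : "");
        }
        return params;
    }

    /** The "model" function: returns a dictionary-like structure and knows nothing about HTML. */
    static Map<String, Object> add(int a, int b) {
        Map<String, Object> result = new LinkedHashMap<>();
        result.put("a", a);
        result.put("b", b);
        result.put("sum", a + b);
        return result;
    }

    /** The "view" function: turns any such structure into an HTML definition list. */
    static String toHtml(Map<String, Object> data) {
        StringBuilder html = new StringBuilder("<dl>");
        for (Map.Entry<String, Object> e : data.entrySet()) {
            html.append("<dt>").append(e.getKey()).append("</dt>")
                .append("<dd>").append(e.getValue()).append("</dd>");
        }
        return html.append("</dl>").toString();
    }

    /** Handles the query part of a request such as add.php?a=1&b=2. */
    static String handleAdd(String query) {
        Map<String, String> p = parseQuery(query);
        return toHtml(add(Integer.parseInt(p.get("a")), Integer.parseInt(p.get("b"))));
    }

    public static void main(String[] args) {
        System.out.println(handleAdd("a=1&b=2"));
        // prints <dl><dt>a</dt><dd>1</dd><dt>b</dt><dd>2</dd><dt>sum</dt><dd>3</dd></dl>
    }
}
```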

Don’t start learning web development with JSF. Try PHP.


Beauty

What is beauty? We are unlikely to answer this question… I would call the sense of beauty fundamental and would not try to define it.

A different question: what do all the objects we consider beautiful have in common? Seth Godin argues that beautiful objects are those that took great effort to create, that require unique skills to make, or that are simply very expensive (which is no coincidence either). That is, beauty as a measure of strength. What is beautiful is what not everyone has.

As for people and animals, what is beautiful there is above all a healthy organism. But health is a measure of strength and competitiveness, just as the cost and uniqueness of a thing show the strength and superiority of its owner.

Now let us ask: what does this notion of beauty have in common with the one we use when we speak of a beautiful proof of a theorem, a beautiful solution to a problem, beautiful code? What is beautiful there is above all a short solution, concise and easily readable code. So I conclude that the second important side of beauty is simplicity.

Let us look closer at those same expensive and/or unique objects, for example works of art. Painters and sculptors know the principle: “A masterpiece is not that to which nothing can be added, but that from which nothing can be taken away.” Taken away without loss of meaning, that is. And this is the same simplicity: the absence of non-functional details.

So beauty is defined by a combination of strength and simplicity. By strength I mean any competitive advantage, and by simplicity the absence of non-functional parts. In this sense simplicity ≠ primitiveness. Without constant simplification science would stand still. Stand still? Science could not even have arisen! The task of any science is to describe the complex behavior of a real system with the smallest number of parameters. Without the drive toward simplification we would never have moved from the geocentric system to the heliocentric one, and Maxwell would never have described the behavior of everything that exists with four equations. The main function of the brain is generalization. And without simplification there is no generalization!

Beauty is the engine of progress.

Just think how many heroic deeds have been done for the sake of a woman’s beauty :)… But that is a slightly different story.


Properties are like global variables

Having decided that we probably aren’t ready to implement a Naked Objects GUI, I started to think about what we can do to fundamentally improve our development experience with J2EE, namely JSF. I’m new to this technology (and to webapps in general; in fact, I think it was a bad idea to start learning web development with J2EE).

The first thing that makes me feel bad is the overuse of properties. First of all, they spawn lots of identical code, and it’s a real pain to rename a property, as NetBeans doesn’t support changing the field, getField and setField names at once. That’s not the worst of it, of course. Do you remember everybody saying “global variables are bad”? But aren’t these properties, which exist only to set some action’s parameters, just like global variables? I think this is exactly like the following:

class SomeClass {
    . . .
    int augend;
    int addend;
    public int add(int a, int b) {
        augend = a;
        addend = b;
        return augend + addend;
    }
}

No good at all. Of course we should write

    public int add(int a, int b) {
        return a + b;
    }

Why use such a complicated scheme in JSF?

Somebody might say: “Properties allow us to validate values!” Man, we have JSF input validators exactly for this purpose! They’re better in most cases: after all, they were introduced deliberately. Of course there are some cases where we should validate values inside properties, but they’re quite rare.

Wouldn’t it be nice if our action methods had parameters?

<h:inputText id="userName" required="true"/>
<h:commandButton action="#{user.login(userName)}" value="Login"/>

No empty getters and setters around, the data flow is crystal clear, and there is less JSP inside the HTML.

Maybe there is such a framework? I’m looking for it.


Declarative Approach to UI: Naked Objects

I’ve googled a bit, asked some questions on SO, and learned that the approach I proposed earlier has a name: Naked Objects pattern. There are several frameworks implementing the concept, most notably the Naked Objects framework, which is the most universal and “pure” implementation.

I’ve played a bit with this framework and was quite disappointed. According to the answers on Stack Overflow, people have tried to implement various Naked Objects-like frameworks, but none is widely known because most of these attempts failed:

All attempts I have seen always end up with user interfaces that even the creator have problems liking. It also seems like there’s no amount of meta-data you can add that will make these UI’s nice. (answer by krosenvold)

And in the case of the Naked Objects framework this statement seems to hold true…

[Screenshots: Naked Objects framework demo UI]

First of all, I had absolutely no idea what to do. It eventually turned out that most operations are performed from context menus, and their context is sometimes strange. I think it may be a rather nice framework if you use it correctly, but the very approach to constructing the UI is rather outdated. People need rich, usable UIs, where they can do more work in less time.

I think a good implementation of Naked Objects is impossible today, because computers are too stupid to match the work of even a mediocre UI designer. And to emphasize,

computers are too stupid.

A program is created to solve a problem: a concrete problem, a set of problems, or a wide class of problems. If the program is designed to help solve more than one problem, the user must perform different sequences of actions to accomplish each task. Some actions are frequently performed together, others are rarely needed at all. So actions, or commands, are not created equal, and to construct a decent UI we must take all these differences into account. No amount of metadata is enough for a machine to lay out a nice UI. Well, some amount would be enough, but it would equal the amount of work needed to design the UI manually, and in that case our framework would be useless.

Note that this argument about layout applies only to graphical interfaces. In text-based, conversational UIs, like bash or BeanShell, you don’t have to worry about a button sitting in an inconvenient place, but you do have to remember all the commands. Command-line UIs are much more versatile, but sometimes too complex for the user…

Wait, if the complexity of the command line is acceptable, why can’t we apply the Naked Objects pattern to constructing command-line UIs? There’s really no reason, and in fact everything is already done! Take PowerShell or BeanShell: just write domain-specific classes and operate on them directly, in any way you want. Sometimes that’s good, but in most cases it is too complex. So you sit down for a couple more hours and write your own CLI. But we can use Naked Objects and autogenerate the parser, help system, formatter, script support, whatever you want. I’ve made some steps in this direction in the aforementioned Cliche library. It has proved useful: I already use it in 3 small programs, and I’m satisfied. Based on this experience, I’d say that although the future of Naked Objects may be bright, today the console interface is the only place where it works really well.


MPS: a language-oriented approach to development

About half a year ago I came across something that interested me a great deal: MPS, the Meta Programming System from JetBrains. It is claimed to be an implementation of an entirely new idea: a development environment for language-oriented programming (w: Language-oriented Programming; LOP below).

The essence of LOP is that problems are solved using domain-specific languages (DSL; Domain Specific Language), and if no such language exists yet, it is created first. For example, if we need to find some pattern in a text, we use regular expressions; and if we are solving, say, a problem in population dynamics, we do not write Fortran code right away but first create a language for describing problems of that kind, and then solve the problem with this highly specific language. Sergey Dmitriev (one of the founders of JetBrains) explains the main ideas of LOP and why it is needed in great detail in the article Language Oriented Programming: The Next Programming Paradigm (also available as pdf).

MPS itself, as far as I understand, is simply a development environment for creating such highly specific languages. As the manual shows, MPS supports various language modules, so a language can be assembled from ready-made blocks instead of being created from scratch. It also uses the very interesting idea of abandoning the textual form of code: all manipulations are performed directly on the syntax tree, on the structure of the code, not on the program text. This makes the language highly flexible and extensible: in principle, you could even embed video clips directly in the code if some task called for it. Video files are overkill, of course, but a decent built-in graph editor would not hurt, because I am getting rather fed up with typing in finite-state machine transition matrices by hand :)

Someone will say, of course, that the idea of LOP is not new; what is new is support for LOP in a development environment, and not just any environment but one from JetBrains (MPS is built on the framework of the well-known IntelliJ IDEA). So it is worth digging into, although that is far from easy: I did not manage it on the first or the second attempt, but things look better now. The beta version is finally out, and enough documentation has finally been written for its release.


Goals and objectives

What is this?

This is my “semi-professional” blog, so to speak: I write here mostly about my main area of professional interest, that is, about software development on average, about the problems of existence in general, and about user interfaces in particular (as a rather acute problem of existence).

What is all this for?

I need to learn to write before it is too late. And keeping a technical blog, in two languages at that, seems to me an excellent opportunity both to practice writing and to brush up on the language. It should also build some discipline, since the main thing in blogging is sticking to a schedule (mine is about 3 posts a week, on average one of them in Russian).

Who am I?

I am Anton Grigoryev, a third-year student at the FMBF faculty of MIPT. I have been into programming for several years; I was once strongly interested in biology, then in physics, but that was long ago and is no longer true :)

You can always reach me by email at ansgri@gmail.com.

A bit more about languages

I write in Russian but read mostly in English, because in the English-language part of the internet information is usually fresher, more detailed, and easier to find than in the Runet. So please forgive me that almost all links will point to material in English.

So, let’s begin.


Declarative Approach to UI: Cliche library

UI is hard, but can we nevertheless make a machine create the UI for us? There are several well-thought-out UI concepts that make UI construction an almost mechanical task. If so, wouldn’t it be nice if the machine could generate a UI that is not very specific, but consistent and bug-free?

I first asked myself this question when I was about to start designing a small console application. I had to make an administrative utility for a small web app, and I thought an admin web interface would be overkill. So the simplest solution was a simple console application. But how simple? There already were several administrative functions in the database support class, so I would need almost no code… except the UI. Whichever I needed, a non-interactive or an interactive CLI, I’d have to write a load of code to either parse the command line or, ahem, parse the user’s interactive commands. You probably see the added complexity.

But this complexity is so unnecessary and the code so boring that I started to think about how to make the machine do the boring work. But before solving a problem you must be sure that there is a problem, right? I asked a question on Stack Overflow:

Do you write interactive console applications? Do you think it should be even easier than already is?

The answers weren’t very inspiring, but I decided the task was worth solving, mostly because I knew I’d have to write at least three more apps of this kind (projects on CompMath).

In half a day of work I had designed the solution: the Cliche CLI library (of course, getting to the present state took another 2 days…). The solution is based on Java Reflection and is fairly straightforward: you feed Cliche an object, it discovers the object’s annotated public methods and builds a command table, in which the Shell looks up the method for the command entered by the user. I tried to explain how Cliche works in this document.
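The mechanism can be illustrated with a stripped-down sketch. This is not Cliche’s source code: the Cmd annotation, MiniShell and Greeter are hypothetical stand-ins, and only commands without arguments are handled.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

class MiniShellSketch {

    /** Hypothetical marker annotation, standing in for Cliche's @Command. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Cmd {}

    /** The object we "feed" to the shell. */
    static class Greeter {
        @Cmd public String hello() { return "Hello, World!"; }

        public String hidden() { return "not a command"; } // not annotated, so not exposed
    }

    static class MiniShell {
        private final Object handler;
        private final Map<String, Method> commands = new HashMap<>();

        MiniShell(Object handler) {
            this.handler = handler;
            // Discover annotated public methods and index them by name: the command table.
            for (Method m : handler.getClass().getMethods()) {
                if (m.isAnnotationPresent(Cmd.class)) {
                    commands.put(m.getName(), m);
                }
            }
        }

        /** Looks up a no-argument command by name and invokes it. */
        String execute(String name) {
            Method m = commands.get(name);
            if (m == null) {
                return "Unknown command: " + name;
            }
            try {
                return String.valueOf(m.invoke(handler));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Command failed: " + name, e);
            }
        }
    }

    public static void main(String[] args) {
        MiniShell shell = new MiniShell(new Greeter());
        System.out.println(shell.execute("hello"));  // prints Hello, World!
        System.out.println(shell.execute("hidden")); // prints Unknown command: hidden
    }
}
```

A real shell would add a read loop, abbreviations and argument handling on top of this table.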

Using this approach I managed to commodify not only the command loop and parsing code, but type conversion too. That is, it works well with methods like int add(int a, int b) and even double sum(double... xs). Of course, you can extend Cliche with new type converters.
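Type conversion driven by reflection can be sketched like this. Again, this is an assumption-laden illustration rather than Cliche’s actual converter code: ArgConversionSketch, convert, convertAll and call are made-up names, only int, double and String are supported, and argument-count errors are not handled.

```java
import java.lang.reflect.Array;
import java.lang.reflect.Method;

class ArgConversionSketch {

    /** Example command object with the two methods mentioned in the text. */
    static class Calculator {
        public int add(int a, int b) { return a + b; }

        public double sum(double... xs) {
            double total = 0;
            for (double x : xs) {
                total += x;
            }
            return total;
        }
    }

    /** Converts one token to the given parameter type; only int, double and String are handled. */
    static Object convert(String token, Class<?> type) {
        if (type == int.class) return Integer.parseInt(token);
        if (type == double.class) return Double.parseDouble(token);
        if (type == String.class) return token;
        throw new IllegalArgumentException("No converter for " + type);
    }

    /** Builds the argument array for m from raw tokens, packing trailing tokens into varargs. */
    static Object[] convertAll(Method m, String[] tokens) {
        Class<?>[] types = m.getParameterTypes();
        Object[] args = new Object[types.length];
        int fixed = m.isVarArgs() ? types.length - 1 : types.length;
        for (int i = 0; i < fixed; i++) {
            args[i] = convert(tokens[i], types[i]);
        }
        if (m.isVarArgs()) {
            Class<?> element = types[fixed].getComponentType();
            Object rest = Array.newInstance(element, tokens.length - fixed);
            for (int i = fixed; i < tokens.length; i++) {
                Array.set(rest, i - fixed, convert(tokens[i], element));
            }
            args[fixed] = rest;
        }
        return args;
    }

    /** Finds the first public method with the given name and calls it with converted tokens. */
    static Object call(Object target, String name, String... tokens) {
        for (Method m : target.getClass().getMethods()) {
            if (m.getName().equals(name)) {
                try {
                    return m.invoke(target, convertAll(m, tokens));
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("Command failed: " + name, e);
                }
            }
        }
        throw new IllegalArgumentException("No such command: " + name);
    }

    public static void main(String[] args) {
        System.out.println(call(new Calculator(), "add", "45", "34"));       // prints 79
        System.out.println(call(new Calculator(), "sum", "1.5", "2.5", "3")); // prints 7.0
    }
}
```

Supporting a new parameter type would mean adding one more branch (or a pluggable converter) to convert; the rest of the machinery stays generic.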

This example illustrates the power of metadata and reflection, but also brings us to a question:

Can this method be applied to GUI auto-generation? How widely can it be used? And, of course, what other software uses an analogous approach? Should we try to make one?


P.S. I’m sorry if my English was incorrect in places (I’m not a native speaker); please point out grammar and other errors in the comments.


About

Hello!

Finally I’m pretty confident that I’m ready to start a blog. I have some time and some ideas, so there won’t be empty pages here :) Blog posts will reflect my primary interests: software development, computers in general and user interfaces in particular.

Why Blog?

I’ve been reading Jeff Atwood’s Coding Horror blog (which I think is kind of a must-read), and I found his reasons for starting a blog pretty convincing.

If anything, what I’ve learned is this: if I can achieve this kind of success with my blog, so can you. So if you’re wondering why the first thing I ask you when I meet you is “do you have a blog?” or “why don’t you post to your blog more regularly?”, or “could you turn that into a blog post?”, now you know why. It’s not just because I’m that annoying blog guy; it’s because I’d like to wish the kind of amazing success I’ve had on everyone I meet.

And, by the way, I think it’s a rather good way to practice my English, though there will also be posts in Russian here: I’ll use the WordPress categories feature to keep them separate.

Who Am I?

My name is Anton Grigoryev, and I am a third-year student at MIPT (Moscow Institute of Physics and Technology). My hobby for the last several years has been software development, and now it is becoming my profession. I’m from Petropavlovsk-Kamchatsky, in the Far East of Russia, but now I live in Dolgoprudny, near Moscow. You definitely don’t want to know all this, but I just have to say it for completeness :)

You can always reach me via email: ansgri@gmail.com.

P.S. My schedule is 2 posts a week.
