This repository has been archived by the owner on Jun 17, 2020. It is now read-only.

aanciaes/sd-project16-17


Distributed Systems

Project - phase 1

Informatics Engineering

Academic year: 2016/2017, 2nd semester

Index

Deadline: April 13th, 23:59
Report deadline: April 28th, during test 1 (or at the DI secretariat)

Goal

The goal of the project is to create a system for indexing and searching information about (external) documents. 
A document consists of the following information: a URL and a list of keywords.
The system will allow a client to:

  • Add documents to be indexed;
  • Remove documents from the index;
  • Search for documents, given a set of keywords.

Does this service make any sense?

Indexing services are useful and used in a number of scenarios. For example, operating systems like Windows and Mac have indexing services that help search for files based on their contents (e.g., Windows Search, Spotlight). Document repositories, in addition to providing document storage and retrieval, also maintain indices to help search for information in the stored documents (e.g., Apache Solr can be used to build such systems... and more complex ones :-)). In the former example, the system you will be building could be used for indexing information about files by adding each file to the index whenever it changes - the URL would be the URL of the file in the filesystem (file://...) and the keywords would be the words present in the file. A similar approach could be used for the second example, replacing the notion of file with that of document. Likewise, you can also use the system you are building for indexing web pages, or any other documents that have a URL and for which you can identify a set of keywords.

 

Components of the system

The system must contain, at least, the following components:

Rendez-vous Server
The rendez-vous server maintains a list of indexing servers. The REST interface of this server should be the following:

@Path("/contacts")
public interface RendezVousService {

    @GET
    @Produces(MediaType.APPLICATION_JSON)
    Endpoint[] endpoints();

    @POST
    @Path("/{id}")
    @Consumes(MediaType.APPLICATION_JSON)
    void register( @PathParam("id") String id, Endpoint endpoint);

    @DELETE
    @Path("/{id}")
    void unregister( @PathParam("id") String id);
}


Indexing servers
Each indexing server maintains indexing information.

The REST interface of this server should be the following:

@Path("/indexer")
public interface IndexerService {

    @GET
    @Path("/search")
    @Produces(MediaType.APPLICATION_JSON)
    List<String> search( @QueryParam("query") String keywords );

    @POST
    @Path("/{id}")
    @Consumes(MediaType.APPLICATION_JSON)
    void add( @PathParam("id") String id, Document doc );

    @DELETE
    @Path("/{id}")
    void remove( @PathParam("id") String id );
}

ADDENDUM 29/3/2017:

The SOAP interface of this server is the following:

package api.soap;

@WebService
public interface IndexerAPI {

    @WebFault
    class InvalidArgumentException extends Exception {

        private static final long serialVersionUID = 1L;

        public InvalidArgumentException() {
            super("");
        }

        public InvalidArgumentException(String msg) {
            super(msg);
        }
    }

    static final String NAME = "IndexerService";
    static final String NAMESPACE = "http://sd2017";
    static final String INTERFACE = "api.soap.IndexerAPI";

    /*
     * keywords contains a list of words separated by '+'.
     * Returns the list of URLs of the documents stored in this server that contain all the keywords.
     * Throws InvalidArgumentException if keywords is null.
     */
    @WebMethod
    List<String> search(String keywords) throws InvalidArgumentException;

    /*
     * Returns true if the document was added, false if the document already exists in this server.
     * Throws InvalidArgumentException if doc is null.
     */
    @WebMethod
    boolean add(Document doc) throws InvalidArgumentException;

    /*
     * Returns true if the document was removed, false if it was not found in the system.
     * Throws InvalidArgumentException if id is null.
     */
    @WebMethod
    boolean remove(String id) throws InvalidArgumentException;
}
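The semantics described in the comments above can be illustrated with a minimal in-memory sketch. It is not the indexing library provided for the project: the class name `LocalIndex` is hypothetical, and it assumes that a document is identified by its URL and that plain `IllegalArgumentException` stands in for the SOAP fault.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory index; the course-provided library may work differently.
class LocalIndex {
    // Maps a document URL (assumed to double as its id) to its keywords.
    private final Map<String, Set<String>> docs = new LinkedHashMap<>();

    // Returns true if the document was added, false if it already exists.
    synchronized boolean add(String url, List<String> keywords) {
        if (url == null || keywords == null) {
            throw new IllegalArgumentException("null argument");
        }
        if (docs.containsKey(url)) {
            return false;
        }
        docs.put(url, new HashSet<>(keywords));
        return true;
    }

    // Returns true if the document was removed, false if it was not found.
    synchronized boolean remove(String url) {
        if (url == null) {
            throw new IllegalArgumentException("null id");
        }
        return docs.remove(url) != null;
    }

    // keywords is a list of words separated by '+'; a document matches
    // only if it contains every one of them.
    synchronized List<String> search(String keywords) {
        if (keywords == null) {
            throw new IllegalArgumentException("null keywords");
        }
        List<String> wanted = Arrays.asList(keywords.split("\\+"));
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : docs.entrySet()) {
            if (entry.getValue().containsAll(wanted)) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

A server implementation would delegate to something like this and translate the `IllegalArgumentException` into `InvalidArgumentException` at the SOAP boundary.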


NOTES:

  1. To allow clients to distinguish between REST and SOAP service instances, the endpoint of the server should be registered at the RendezVousServer with attributes that include the key "type" with value "rest" or "soap", respectively. In the absence of the "type" key, the client will assume the server is a REST server.
  2. The first argument of the indexer, if present, must be the URL of the rendez-vous server -- the test program will start the indexer with the correct parameters.
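The two notes above could be handled roughly as follows. This is only a sketch: `EndpointInfo` is a hypothetical stand-in for the `Endpoint` class used in the interfaces (assumed here to carry a URL and a map of attributes), and `IndexerArgs` is an illustrative helper name.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the Endpoint class: a URL plus free-form attributes.
class EndpointInfo {
    final String url;
    final Map<String, Object> attributes;

    EndpointInfo(String url, Map<String, Object> attributes) {
        this.url = url;
        this.attributes = attributes == null
                ? Collections.<String, Object>emptyMap()
                : new HashMap<>(attributes);
    }

    // A server is SOAP only when "type" is present and equals "soap";
    // a missing "type" key means REST, as stated in note 1.
    boolean isSoap() {
        Object type = attributes.get("type");
        return type != null && "soap".equalsIgnoreCase(type.toString());
    }
}

class IndexerArgs {
    // Note 2: the first argument, if present, is the rendez-vous server URL.
    // Returns null when absent, so the caller can fall back to discovery.
    static String rendezVousUrl(String[] args) {
        return args.length > 0 ? args[0] : null;
    }
}
```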

Access pattern to server

The indexing service will be used by clients according to the following access pattern.

For adding information for a document, a client will: (1) contact the rendez-vous server to get a list of indexing servers; (2) select one of the indexing servers and invoke the add operation.

For removing information of a document, a client will: (1) contact the rendez-vous server to get a list of indexing servers; (2) select one of the indexing servers and invoke the remove operation.

For searching for information stored in the system, a client will: (1) contact the rendez-vous server to get a list of indexing servers; (2) select one of the indexing servers and invoke the search operation.
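All three operations share the same two steps, which can be sketched as below. The class `IndexerClient` is hypothetical; the rendez-vous lookup and the actual REST or SOAP call are passed in as functions so that the sketch stays independent of any client library, and picking a server at random is just one possible selection policy.

```java
import java.util.List;
import java.util.Random;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical client skeleton for the access pattern described above.
class IndexerClient {
    private final Supplier<List<String>> rendezVous; // step 1: list of indexing-server URLs
    private final Random random;

    IndexerClient(Supplier<List<String>> rendezVous, Random random) {
        this.rendezVous = rendezVous;
        this.random = random;
    }

    // Step 2: select one server and run the operation (add, remove or search) on it.
    <T> T invoke(Function<String, T> operation) {
        List<String> servers = rendezVous.get();
        if (servers.isEmpty()) {
            throw new IllegalStateException("no indexing servers registered");
        }
        String chosen = servers.get(random.nextInt(servers.size()));
        return operation.apply(chosen);
    }
}
```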


IMPORTANT: In phase 1, each indexing server only needs to be able to return information for documents that have been added to that server.


NOTE: We will provide the following components, to be used in the system being developed and for testing it:

  • a library for indexing, supporting an interface similar to the interface of the indexing server, which stores information locally in a node;
  • a test program that will execute a sequence of operations and check if the returned results are the expected ones. You should not change the code of the test program.


Functionalities

Up-to-date rendez-vous server [2 points]

The rendez-vous server must maintain up-to-date information about indexing servers. To this end, the information about an indexing server must be discarded if the server stops.
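The assignment does not prescribe how stopped servers are detected. One possible approach, sketched below under that assumption, is a lease: each indexing server re-registers (or sends a heartbeat) periodically, and the rendez-vous server drops any entry that has not been refreshed within a timeout. The class `LeaseRegistry` and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical lease-based registry: entries expire unless refreshed.
class LeaseRegistry {
    private final long timeoutMillis;
    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();

    LeaseRegistry(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Called on registration and on every later heartbeat or re-registration.
    void touch(String id, long nowMillis) {
        lastSeen.put(id, nowMillis);
    }

    // Called when a server unregisters explicitly.
    void unregister(String id) {
        lastSeen.remove(id);
    }

    // Discards expired entries and returns the ids still alive, sorted.
    List<String> alive(long nowMillis) {
        lastSeen.entrySet().removeIf(e -> nowMillis - e.getValue() > timeoutMillis);
        return new ArrayList<>(new TreeSet<>(lastSeen.keySet()));
    }
}
```

The current time is passed in explicitly, which keeps the expiry logic easy to test; a server would call `alive(System.currentTimeMillis())` before answering a request for the list of endpoints.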

Automatic discovery of the rendez-vous server [2 points]

It should be possible to find the rendez-vous server automatically. To this end, the rendez-vous server should reply to a multicast request carrying the message "rendezvous" with a string containing the URL of the rendez-vous server. The multicast address and port used by the server can be chosen freely.
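The server side of this discovery exchange might look like the sketch below. The group address `230.0.0.1` and port `6969` are arbitrary assumptions (the text leaves them free), the class name `DiscoveryResponder` is hypothetical, and the reply is sent by unicast to the address and port the request came from.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.MulticastSocket;
import java.nio.charset.StandardCharsets;

// Hypothetical responder that answers "rendezvous" requests with the server URL.
class DiscoveryResponder {
    static final String GROUP = "230.0.0.1"; // assumption: any free multicast address works
    static final int PORT = 6969;            // assumption: any free port works

    // Returns the reply payload, or null when the packet is not a discovery request.
    static byte[] replyFor(byte[] data, int length, String serverUrl) {
        String request = new String(data, 0, length, StandardCharsets.UTF_8).trim();
        return "rendezvous".equals(request)
                ? serverUrl.getBytes(StandardCharsets.UTF_8)
                : null;
    }

    // Blocks forever, answering each discovery request with the server URL.
    static void serve(String serverUrl) throws IOException {
        InetAddress group = InetAddress.getByName(GROUP);
        try (MulticastSocket socket = new MulticastSocket(PORT)) {
            // A null interface defers to the default multicast interface.
            socket.joinGroup(new InetSocketAddress(group, PORT), null);
            byte[] buffer = new byte[1024];
            while (true) {
                DatagramPacket request = new DatagramPacket(buffer, buffer.length);
                socket.receive(request);
                byte[] reply = replyFor(request.getData(), request.getLength(), serverUrl);
                if (reply != null) {
                    socket.send(new DatagramPacket(reply, reply.length,
                            request.getAddress(), request.getPort()));
                }
            }
        }
    }
}
```

An indexing server started without arguments would send "rendezvous" to the same group and port and wait, with a timeout, for the reply.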

Base REST [7 points]

This consists of the complete system, composed of the rendez-vous and indexing servers, communicating using REST.

As a result of this option, you should have a working system consisting of a REST-based rendez-vous server and a set of REST-based indexing servers.

Each indexing server only needs to be able to return information for documents that have been added to that server. However, if a remove for a given document is invoked in a server, the information for that document should be removed independently of the server where it is indexed.
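The text does not say how a remove reaches the server that actually holds the document. One possible design, sketched below as an assumption, is for the server that receives the client request to remove its own copy and then ask every other indexing server (obtained from the rendez-vous server) to do a local-only remove. The names `Peer` and `IndexerNode` are hypothetical, and the local index is reduced to a set of ids.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical view of an indexing server; a real peer would wrap a REST or SOAP client.
interface Peer {
    // Removes the document from this server only; never forwards the request.
    boolean removeLocal(String id);
}

class IndexerNode implements Peer {
    private final Set<String> localDocs = new HashSet<>();

    synchronized void addLocal(String id) {
        localDocs.add(id);
    }

    @Override
    public synchronized boolean removeLocal(String id) {
        return localDocs.remove(id);
    }

    // Entry point for client requests: remove here, then on every peer.
    // Peers are only asked for a local remove, so requests cannot loop.
    boolean remove(String id, List<Peer> peers) {
        boolean removed = removeLocal(id);
        for (Peer peer : peers) {
            try {
                removed |= peer.removeLocal(id);
            } catch (RuntimeException e) {
                // The peer may have stopped; skip it and continue with the others.
            }
        }
        return removed;
    }
}
```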

Base SOAP [6 points]

This consists of implementing the indexing server using SOAP. It is optional to also implement the rendez-vous server in SOAP or to use the REST version. The exact interfaces that the servers must implement will be introduced in lab 3.

As a result of this option, you should have a working system consisting of a rendez-vous server (either using REST or SOAP) and a set of SOAP-based indexing servers.

Each indexing server only needs to be able to return information for documents that have been added to that server. However, if a remove for a given document is invoked in a server, the information for that document should be removed independently of the server where it is stored.

Base SOAP+REST [3 points]

This consists of having REST and SOAP indexing servers capable of working together.

As a result of this option, you should have a working system consisting of a REST-based rendez-vous server and a set of indexing servers, some working in REST and the others working in SOAP.



Notes on faults

Regarding failures of the components, you must assume:
  • the rendez-vous server will not fail;
  • indexer servers may fail permanently (fail-stop model) -- note that this will cause connections to the server to fail.

Regarding communications, you should assume that communication may fail temporarily.
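Since communication may fail temporarily, remote calls are usually wrapped in a bounded retry. The helper below is a minimal sketch of that idea; the class name `Retry`, the number of attempts and the delay are illustrative assumptions, not values required by the project.

```java
import java.util.concurrent.Callable;

// Hypothetical helper: repeats an operation a bounded number of times.
class Retry {
    // Runs the operation up to maxAttempts times, sleeping delayMillis
    // between attempts, and rethrows the last failure if all attempts fail.
    static <T> T call(Callable<T> operation, int maxAttempts, long delayMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.call();
            } catch (Exception e) {
                last = e; // possibly a temporary failure: remember it and try again
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis);
                }
            }
        }
        throw last;
    }
}
```

A bounded number of attempts matters here because an indexing server may also have failed permanently, in which case retrying forever would never succeed.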


Environment

IMPORTANT: The project must be demonstrated in the labs, with servers running on at least two computers/containers, using either existing hardware or the students' own hardware.

Your system will be tested using the test program provided in this link, which is divided in steps that test the different functionalities of your program -- you should use the client to check the progress of your project as you add new functionalities to your work.

The grading of your project will take into consideration the tests passed by your system -- so, you should ensure that your system passes as many tests as possible (projects will be accepted even if they do not pass all tests).


Written Report:

A written report must be delivered by each group describing their work and implementation. The report should have at most 4 pages (any code that is found relevant should be delivered as an appendix that goes beyond the 4 page limit).

The report must cover the following topics.

  • General description of the work performed by the students, clearly identifying which aspects were completed and fully implemented.
  • Limitations of the delivered code.
    Students should include as an annex a table that specifies which tests their code passed. For the failed tests, students should indicate whether the test failed because the tested functionality was not implemented or because it had a bug.
  • Interfaces of the servers (both SOAP and REST).
  • Clear explanation of the mechanisms (i.e., protocols) employed for:
    • Discovery of the rendez-vous servers.
    • Keeping the rendez-vous server up-to-date.
    • Handling of faults.
  • Discussion of the implementation decisions taken by the students, when applicable, discussing these decisions in light of possible alternatives (this should include how operations are executed, with focus on those whose implementation is non-trivial).

The report can also cover difficulties encountered by the students during the execution of the project or other aspects that the students consider relevant.

Delivery Rules:

The code of the project should be delivered in electronic format, by uploading a zip file that includes:

  • all source files (src directory in the project)
  • the sd2017-t1.props file
  • the pom.xml file

Use this link to deliver your work (NOTE: you must log in with your @campus account).
To keep the size of the zip archive small, zip the full Eclipse project minus the target folder that Maven generates with the compiled classes and downloaded dependencies.
IMPORTANT: The name of the zip archive should be: SD2017-T1-NUM1.zip or SD2017-T1-NUM1-NUM2.zip

NOTE: You may deliver the project as many times as needed.