## Ordering Hotels on TripAdvisor as a Minimum Feedback Arc Set problem

### Hello from the TripAdvisor Machine Learning Group

This is the first in a series of blog posts from the Machine Learning Group at TripAdvisor. We get to work on lots of fun, interesting problems, and we thought you might like to hear about them.

## What order should we show our hotels?

Take a Hotels page for a city on TripAdvisor, like this one for Boston.

How should we sort the hotels on this page? Whenever possible, we try to show personalized results on this page, based on the hotels a user has seen in the past. However, a fairly large fraction of people come to the site with no history at all. They are visiting for the first time. What should we show these users? Traditionally, we sorted them by bubble rating, which is essentially a time weighted average of the reviews for that hotel. You end up with a bunch of very expensive, fantastic hotels at the top of the page. While the Four Seasons is a great place, it is probably too expensive for your average traveler. Can we do better? This blog post will talk about another approach, that aggregates the preferences of all previous users to create a hotel ordering for a given city.

## Pairwise preferences

First, we need to figure out a way to extract user preferences from our site. There is a strong position effect, meaning that users are much more likely to click on hotels at the top of the page, no matter what the hotel. Those highly rated hotels have the highest click-through rate, but that is mostly because the often appeared at the top of the page in the past. We need a way of figure out which of the many hotels a user looked at was their final choice. A user can also click off from our site to actually book a hotel. Sometimes users go through the booking process for a few hotels before finding a room and price they like. That’s a good indicator that the user *really* liked a hotel. Say a user clicked from the hotels page to the hotel detail page for 5 hotels: *A*, *B*, *C*, *D*, and *E*. Maybe they clicked on the commerce link for two of those hotel, *B* and *D*. In this case it would be fair to say that the user preferred hotels *B* and *D* to *A*, *C*, and *E*. We could say that the user had 6 pairwise preferences that *B* > *A*, *B* > *C*, *B* > *E*, *D* > *A*, *D* > *C*, and *D* > *E*.

We can look at all of the pairwise preferences generated by all people visiting a hotel page. In aggregate, this should give us really good data for ordering the hotels. The next section will look at how to combine those preferences.

## Coalescing all the pairwise preferences into a hotel order

There will be plenty of conflicting opinion between users. For a given pair of hotels *A* and *B*, we can simply look at the net preference.

If there were 10 users that preferred *A* to *B*, and 4 users that preferred *B* to *A, we can just view that as a net preference of 6 for *A* over *B*. We can view these net pairwise preferences as a directed graph. The nodes of the graph represent hotels, and there is an arc *(A,B)* with weight *w* if there were *w* net preferences for hotel *A* over hotel *B*.

Here is an example with real data for a very small city

We would like to find an ordering of the nodes in the graph. Say hotel *A* comes before hotel *B* in our ordering. Then if there was a net preference for *A* > *B*, this ordering would be respecting the user’s preferences. However, if the net preferences were for *B* > *A*, then we would have it backwards. We want to find an ordering of the nodes that maximizes the weight of the “forward” arcs we get right, or equivalently minimizes the weight of the “backward” arcs we get wrong. This will be the order that most closely respects our user’s preferences.

This problem is known as the Feedback Arc Set Problem. This problem is known to be NP-complete. In fact, it was listed as one of the original 21 NP-problems in Richard Karp‘s initial paper on NP-completeness. We want to find the minimum weight of arcs to remove from our graph that will give us a directed acyclic graph (DAG). For any DAG, there will be an ordering of the nodes such that there are only arcs going from an earlier node to a later one. If our graph has either arc *(A,B)* or *(B,A)* for each pair of nodes *A* and *B* (making this graph a *tournament*), then this problem is also known as the Linear Ordering Problem.

Papers about the Feedback Arc Set Problem seem to be divided into two camps. Papers by people in Theoretical Computer Science aim to provide a known quality of approximation. The Graph Drawing community looks at approaches that tend to be more practical, but without theoretical guarantees. They just need to find a Directed Acyclic Graph so they can layout an attractive graph. We will fall into the “practical” camp, by finding good, heuristic solutions.

## Local Search with Random Restarts to the Rescue

My favorite algorithm for this sort of NP-complete problem is local search. Local search starts from a current solution, and has a neighborhood of adjacent solutions. It sees which adjacent solution improves upon the current solution the most, and then makes that solution the current one. This repeats, hill climbing until no adjacent solution is an improvement. At that point, you have reached a local minima (assuming we’re minimizing).

To do a better job of finding a global optima, we start from a number of random initial solutions, and then choose the best local minima we find along the way. This simple approach is applicable to many combinatorial problems, and often does better than more complicated approaches like Genetic Algorithms.

In our case, a permutation of the nodes (hotels) represents a current solution. Any given solution has a loss function which is the total weight of “back arcs”, representing the total weight of preferences that are not respected by the current solution. We want to minimize this loss function. We swap each pair of nodes to generate the neighborhood of adjacent solutions. I compute the change in loss for each possible swap, and then choose the one with the steepest improvement.

## Fast Updates

One thing that makes this approach feasible even for large cities with thousands of hotels is the fact that you can quickly compute the change in loss dynamically.

Say we want to know the effect of swapping node *x* and *y* where node *x* is in position *i* and *y* is in position *j*, for *i < j*. Let *w(x,y)* be the net preference of hotel *x* over *y*. Let *Z* be the set of hotels in positions in between *i* and *j* in the order.

This is vastly faster than computing everything from scratch each time.

## A heuristic starting point

Usually it is best in local search algorithms to start from a random initial point. In this problem, there is a natural heuristic that makes a good starting point. For each hotel, consider the total weight of the out arcs, minus the total weight of the in arcs. If you sort hotels by this measure, and then do local search, you will typically find a very good solution. Intuitively, a hotel with a lot more “out” weight compare to the “in” weight is a good hotel, so it makes sense to put it toward the front of the order. The local solution does make significant changes to the initial order, but it seems to be a good starting approach. We try this heuristic starting point first, and then generate many random permutations as starting points.

## Smoothing

Sparse data can cause some problem with this approach. You might imagine a hotel *A* that get very little traffic. Say only a single user looked at *A*, and preferred it the hotel *B* that would otherwise be at the front of our ordering. Where will we put *A* in our order? Considering only the loss function, the best place would be to demote *B* from the first slot to the second, and then put *A* first. However, in practice *A* is probably not the best hotel in the city, which would have have been in many thousands of pairwise comparisons.

Clearly we need a bit of smoothing to keep the low traffic from dominating our order by chance. One easy approach is to add a single click of preference between each pair of hotels *(A,B)* that currently don’t have a preference either way. That smoothing click can be based on some other secondary measure. For example you might add a click of preference for the higher rated hotel. Even a single “tiebreaker” vote is enough to move the low volume hotels to a more appropriate place.

## Constraints

There are often other business constraints on the hotel order. We might not want to put a very low rated hotel at the top of an important city, even if the data supported it being there. Once nice property of the local search technique is that it can support arbitrary penalty terms in our objective. Look at the current ordering of hotels, and total up any penalties that may exist. A two bubble hotel at the top of Las Vegas? Give low rated hotels on the first few pages a big penalty, and it will more down to where it belongs. Even if these issues may not happen much in practice, the fact that the penalty is in place allows us to not worry about bad things resulting from a purely data driven approach.

## Results

Here are some plots showing the local search algorithm in action for Paris, a city with 1552 hotels in this data set. The blue line is the local search from the heuristic starting point. The other 11 lines are from random restarts. The local search solutions were all within 0.7% of the best solutions, which was a random restart. The heuristic solution was 0.2% worse than the best.

The implementation used took each hotel in turn, and figured out which was the best hotel to swap it with, saving improving swaps as it goes. We continue these passes through the hotels until there were no improvements in a pass. Because of this approach, the number of possible local search swaps is always a multiple of the number of hotels, or 1552 in this case. You can see that it made up to 8 passes in this case. Clealy stopping after 2 or 3 passes could save computation time without much of a change in solution quality.

With some early stopping, it was possible to process all cities in a few hours of CPU time. As this is an offline operation, that is easily fast enough.

## Conclusions

Pairwise preferences can be a good way to order hotels for users without any personal history. Local search can find solutions that seem intuitively appealing, and easily incorporate other important business constraints.

## Leave a Reply