| Structured community portals extract and integrate information from raw Web pages to present a unified view of community entities and relationships. In this dissertation, we argue that to build such portals, a top-down, compositional, and incremental approach is a good way to proceed. Compared to current approaches that employ complex monolithic techniques, this approach is easier to develop, understand, and debug. In this approach, we first select a small set of important community sources. Next, we compose plans that extract and integrate data from these sources using a set of extraction/integration operators. Executing these plans yields an initial structured portal. We then incrementally expand this portal by monitoring current data sources to detect and add new data sources and entities. To explore this approach, we develop the Cimple software platform, which developers can use to build community portals. Building Cimple raises difficult questions about data and plan modeling, storage, and plan execution. We discuss these questions and detail our choices. We then employ Cimple to build DBLife, a portal for the database community. We find that DBLife can be built quickly and achieve relatively high accuracy using simple extraction/integration operators, and that it can be maintained and expanded with little human effort. We report on our experiments with DBLife, as well as the lessons learned about the limitations of the current Cimple platform. The most important lesson is that a Cimple-like platform can significantly benefit from leveraging current RDBMS software and technologies. |