Google Search Appliance Guide to Software Release 7.2 Google Search Appliance software version 7.2
Google, Inc. 1600 Amphitheatre Parkway Mountain View, CA 94043 www.google.com GSA-REL_100.07 December 2013 © Copyright 2013 Google, Inc. All rights reserved. Google and the Google logo are registered trademarks or service marks of Google, Inc. All other trademarks are the property of their respective owners. Use of any Google solution is governed by the license agreement included in your original contract. Any intellectual property rights relating to the Google services are and shall remain the exclusive property of Google, Inc. and/or its subsidiaries (“Google”). You may not attempt to decipher, decompile, or develop source code for any Google product or service offering, or knowingly allow others to do so. Google documentation may not be sold, resold, licensed or sublicensed and may not be transferred without the prior written consent of Google. Your right to copy this manual is limited by copyright law. Making copies, adaptations, or compilation works without prior written authorization of Google is prohibited by law and constitutes a punishable violation of the law. No part of this manual may be reproduced in whole or in part without the express written consent of Google. Copyright © by Google, Inc.
Google Search Appliance: Guide to Software Release 7.2
Contents
Guide to Software Release 7.2
About “Beta” Features
New and Changed Features
New GSA Version Manager
Admin Console 2.0
Onboard Group Resolution
Sort Results by Metadata
Support for HTTP POST Requests
Wildcard Search
Language Improvements
Trusted Applications
New Entity Recognition Features
New Crawl Configuration Features
Connectors 4.0 (Beta)
Connectors 3.2
Deprecated Legacy Security
Release 7.2 Documentation
Specifications and Usage Limits
Supported Browsers and Third-Party Software
Guide to Software Release 7.2
This document describes features that are new or changed since version 7.0. Google Search Appliance version 7.2 updates are now available for the GB100, GB500, GB-7007 and GB9009. For a list of new, fixed, previous, and closed issues for software release 7.2, refer to the release notes. Both the software updates and the release notes are available at the Google for Work Support site, https://google.secure.force.com (login required).
About “Beta” Features
Features marked as “Beta” are provided as a preview, and are not supported. Beta features have constraints and are not recommended for production usage at this time. Google encourages feedback on these features.
New and Changed Features
The following sections describe new and changed features in software release 7.2:
• New GSA Version Manager
• Admin Console 2.0
• Onboard Group Resolution
• Sort Results by Metadata
• Support for HTTP POST Requests
• Wildcard Search
• Language Improvements
• Trusted Applications
• New Entity Recognition Features
• New Crawl Configuration Features
• Connectors 4.0 (Beta)
• Connectors 3.2
• Deprecated Legacy Security
• Supported Browsers and Third-Party Software
• Release 7.2 Documentation
• Specifications and Usage Limits
New GSA Version Manager
Release 7.2 introduces a new version manager and changes to the software update process. The new version manager is web-based and provides a series of pages designed to make future software updates easier to perform.
Admin Console 2.0
Release 7.2 presents a complete visual redesign of the Google Search Appliance Admin Console. It now has the clear and usable design that you might be familiar with from other Google products, including Google Apps.
As part of the redesign, the sidebar navigation has been reorganized to provide intuitive access to Admin Console pages. The new sidebar presents the following options:
• Content Sources: Lists pages used for web crawl as well as other methods for ingesting content into the search appliance. Also lists diagnostics pages related to content sources.
• Index: Lists pages used for manipulating the search index, including entity recognition and collections. Also lists diagnostics pages related to the index.
• Search: Lists pages used for creating the public and secure search experience, including front ends, dynamic navigation, and Universal Login. Also lists diagnostics pages related to search.
• Reports: Lists pages used for generating logs and reports.
• GSA Unification: Used for configuring GSA Unification (no change from prior releases).
• GSA^n: Used for configuring GSA^n (no change from prior releases).
• Administration: Lists pages used for administering the search appliance (minor changes from prior releases).
Also, some of the Admin Console pages have been moved to new locations in the sidebar. If you have used prior releases of Google Search Appliance software, consult the following table for mappings from pre-7.2 Admin Console page locations to new ones.
Previous Location | Release 7.2 Location
Home | Same as previous version
Crawl and Index > Crawl URLs | Content Sources > Web Crawl > Start and Block URLs
Crawl and Index > Databases | Content Sources > Databases
Crawl and Index > Feeds | Content Sources > Feeds
Crawl and Index > Crawl Schedule | Content Sources > Web Crawl > Crawl Schedule
Crawl and Index > Crawler Access | Content Sources > Web Crawl > Secure Crawl > Crawler Access
Crawl and Index > Proxy Servers | Content Sources > Web Crawl > Proxy Servers
Crawl and Index > Forms Authentication | Content Sources > Web Crawl > Secure Crawl > Forms Authentication
Crawl and Index > Case-Insensitive Patterns | Content Sources > Web Crawl > Case-Insensitive Patterns
Crawl and Index > HTTP Headers | Content Sources > Web Crawl > HTTP Headers
Crawl and Index > Duplicate Hosts | Content Sources > Web Crawl > Duplicate Hosts
Crawl and Index > Document Dates | Index > Document Dates
Crawl and Index > Host Load Schedule | Content Sources > Web Crawl > Host Load Schedule
Crawl and Index > Coverage Tuning | Content Sources > Web Crawl > Coverage Tuning
Crawl and Index > Freshness Tuning | Content Sources > Web Crawl > Freshness Tuning
Crawl and Index > Collections | Index > Collections
Crawl and Index > Composite Collections | Index > Composite Collections
Crawl and Index > Index Settings | Index > Index Settings
Crawl and Index > Entity Recognition | Index > Entity Recognition
Serving > Front Ends | Search > Search Features > Front Ends
Serving > Front Ends > Output Format | Search > Search Features > Front Ends > Output Format
Serving > Front Ends > KeyMatch | Search > Search Features > Front Ends > KeyMatch
Serving > Front Ends > Related Queries | Search > Search Features > Front Ends > Related Queries
Serving > Front Ends > Filters | Search > Search Features > Front Ends > Filters
Serving > Front Ends > Remove URLs | Search > Search Features > Front Ends > Remove URLs
Serving > Front Ends > OneBox Modules | Search > Search Features > Front Ends > OneBox Modules
Serving > Query Settings | Search > Search Features > Query Settings
Serving > OneBox Modules | Content Sources > OneBox Modules
Serving > Document Preview Module | Search > Search Features > Document Preview Module
Serving > Result Biasing | Search > Search Features > Result Biasing
Serving > Dynamic Navigation | Search > Search Features > Dynamic Navigation
Serving > Suggestions | Search > Search Features > Suggestions
Serving > Access Control | Search > Secure Search > Access Control
Serving > Head Requestor Deny Rules | Search > Secure Search > Head Requestor Deny Rules
Serving > Policy ACLs | Search > Secure Search > Policy ACLs
Serving > Universal Login | Search > Secure Search > Universal Login
Serving > Universal Login Auth Mechanisms > Cookie | Search > Secure Search > Universal Login Auth Mechanisms > Cookie
Serving > Universal Login Auth Mechanisms > HTTP | Search > Secure Search > Universal Login Auth Mechanisms > HTTP
Serving > Universal Login Auth Mechanisms > Client Certificate | Search > Secure Search > Universal Login Auth Mechanisms > Client Certificate
Serving > Universal Login Auth Mechanisms > Kerberos | Search > Secure Search > Universal Login Auth Mechanisms > Kerberos
Serving > Universal Login Auth Mechanisms > SAML | Search > Secure Search > Universal Login Auth Mechanisms > SAML
Serving > Universal Login Auth Mechanisms > Connectors | Search > Secure Search > Universal Login Auth Mechanisms > Connectors
Serving > Universal Login Auth Mechanisms > LDAP | Search > Secure Search > Universal Login Auth Mechanisms > LDAP
Serving > Universal Login Form Customization | Search > Secure Search > Universal Login Form Customization
Serving > Flexible Authorization | Search > Secure Search > Flexible Authorization
Serving > Alerts | Index > User Alerts
Serving > Language Bundles | Search > Search Features > Language Bundles
Status and Reports > Crawl Status | Content Sources > Diagnostics > Crawl Status
Status and Reports > Crawl Diagnostics | Index > Diagnostics > Index Diagnostics
Status and Reports > Real-time Diagnostics | Content Sources > Diagnostics > Real-time Diagnostics and Search > Secure Search > Real-time Diagnostics (see bug 8427148)
Status and Reports > Crawl Queue | Content Sources > Diagnostics > Crawl Queue
Status and Reports > Content Statistics | Index > Diagnostics > Content Statistics
Status and Reports > Export URL | Index > Diagnostics > Export URLs
Status and Reports > Serving Status | Search > Diagnostics > Search Status
Status and Reports > System Status | Administration > System Status
Status and Reports > Serving Logs | Reports > Serving Logs
Status and Reports > Search Reports | Reports > Search Reports
Status and Reports > Search Logs | Reports > Search Logs
Status and Reports > Event Log | Administration > Event Log
Connector Administration > Connector Managers | Content Sources > Connector Managers
Connector Administration > Connectors | Content Sources > Connectors
Social Connect > User Results | Search > Search Features > User Results
Social Connect > Expert Search | Search > Search Features > Expert Search
Cloud Connect > Google Apps | Content Sources > Google Apps
GSA Unification > Host Configuration | Same as previous version
GSA^n > Configuration | Same as previous version
GSA^n > Network Diagnostics | Same as previous version
Administration > System Settings | Same as previous version
Administration > Network Settings | Same as previous version
Administration > User Accounts | Same as previous version
Administration > Login Terms | Same as previous version
Administration > Change Password | Same as previous version
Administration > SNMP Configuration | Same as previous version
Administration > Certificate Authorities | Same as previous version
Administration > DNS Override | Same as previous version
Administration > SSL Settings | Same as previous version
Administration > LDAP Setup | Same as previous version
Administration > License | Same as previous version
Administration > Import/Export | Same as previous version
Administration > Reset Index | Index > Reset Index
Administration > Shutdown | Same as previous version
Administration > Remote Support | Same as previous version
Administration > Support Scripts | Same as previous version
Onboard Group Resolution
With release 7.2, you can dramatically reduce the latency for group resolution by periodically feeding groups information to the search appliance. When the groups information is on the search appliance, it is available in the security manager for resolving groups at authentication time. For more information about this feature, see “Feeding Groups to the Search Appliance” in the Feeds Protocol Developer’s Guide.
Sort Results by Metadata
Starting in release 7.2, the Google Search Appliance provides the ability to sort results by metadata and entities associated with individual documents. You can use this feature to sort documents by values such as prices, dates, and authors. No configuration is required to enable this feature; the sorting option is activated through a URL parameter on a per-query basis. To learn more, see “Sort by Metadata” in the Search Protocol Reference.
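Because sorting is requested per query through a URL parameter, a client only needs to add that parameter when it builds the search URL. The following is a minimal sketch, not a definitive implementation. It assumes the parameter takes the form sort=meta:<name>:<direction>, with A for ascending and D for descending; confirm the exact syntax in “Sort by Metadata” in the Search Protocol Reference. The host name gsa.example.com and the metadata name price are hypothetical.

```python
from urllib.parse import urlencode


def build_sorted_search_url(host, query, meta_name, descending=False):
    """Build a /search URL that asks for results sorted by a metadata field.

    Assumption: the sort value has the form meta:<name>:<direction>.
    Check the Search Protocol Reference for the full syntax.
    """
    direction = "D" if descending else "A"
    params = {
        "q": query,
        "site": "default_collection",   # collection to search
        "client": "default_frontend",   # front end to use
        "output": "xml_no_dtd",         # XML results
        "sort": f"meta:{meta_name}:{direction}",
    }
    # urlencode percent-encodes the colons in the sort value.
    return f"http://{host}/search?{urlencode(params)}"


# Hypothetical host and metadata name, for illustration only.
url = build_sorted_search_url("gsa.example.com", "laptop", "price")
print(url)
```

Keeping the sort value in one helper makes it easy to adjust if the reference specifies additional fields for the parameter.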
Support for HTTP POST Requests
In some instances, a query string might exceed the 2KB limit of GET requests and be truncated. This might happen when you submit dynamic navigation queries containing a large number of metadata filters. Starting with release 7.2, you can avoid this limitation by submitting POST requests instead, which have a much larger body limit (10KB).
POST support is only available for:
• Requests for search service (/search)
• Public search
• Secure search, but only when the Trusted Applications feature is used
POST support is not available for Universal Login Auth Mechanisms. You must use the GET command for these. For detailed information about submitting POST requests, see “Using the POST Command” in the Search Protocol Reference.
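One way a client might apply this is to fall back to POST only when the encoded query would not fit in a GET request. The sketch below illustrates that choice under stated assumptions: the search parameters are sent form-encoded in the POST body, the 2KB threshold is measured on the encoded query string, and gsa.example.com is a hypothetical host. See “Using the POST Command” in the Search Protocol Reference for the format the appliance actually expects.

```python
from urllib.parse import urlencode
from urllib.request import Request

GET_LIMIT_BYTES = 2 * 1024  # 2KB GET limit described in this guide


def build_search_request(host, params):
    """Return a GET request for short queries and a POST request otherwise.

    Assumption: POST parameters are form-encoded in the request body.
    The request is only constructed here, not sent.
    """
    encoded = urlencode(params, doseq=True)
    if len(encoded) <= GET_LIMIT_BYTES:
        return Request(f"http://{host}/search?{encoded}", method="GET")
    return Request(
        f"http://{host}/search",
        data=encoded.encode("utf-8"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Hypothetical host; a long query stands in for many metadata filters.
request = build_search_request("gsa.example.com", {"q": "x" * 3000})
print(request.get_method(), request.full_url)
```

To send the request, pass it to urllib.request.urlopen. Secure search over POST additionally requires the Trusted Applications feature, which this sketch does not cover.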
Wildcard Search
Wildcard search is a new feature that enables your users to search by entering a word pattern rather than the exact spelling of a term. The search appliance supports two wildcard operators:
• *: Matches zero or more characters
• ?: Matches exactly one character
Using wildcards can simplify queries for long names, technical data, pharmaceutical information, or strings where the exact spelling varies or is unknown. A user can search for all words starting with a particular pattern, ending with a particular pattern, or having a particular substring pattern.
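The semantics of the two operators can be illustrated with a small pattern matcher. This is only a sketch of how the operators behave on single words, not the search appliance's implementation. It assumes that wildcards match word characters only and that matching is case-insensitive; neither assumption comes from this guide.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a wildcard pattern into a compiled regular expression.

    '*' becomes zero or more word characters, '?' becomes exactly one.
    Assumption: wildcards match word characters only.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(r"\w*")
        elif ch == "?":
            parts.append(r"\w")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)


def matches(pattern, word):
    """Return True if the whole word matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(word) is not None


print(matches("pharm*", "pharmaceutical"))  # prefix pattern
print(matches("*cillin", "amoxicillin"))    # suffix pattern
print(matches("gr?y", "grey"))              # single-character wildcard
```

A pattern such as gr?y matches grey and gray but not gry, because ? requires exactly one character.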
By default, wildcard indexing is disabled for your search appliance. You can enable or disable wildcard indexing by using the Index > Index Settings page. For information about wildcard indexing, click Admin Console Help > Index > Index Settings. See also “Wildcard Indexing” in Administering Crawl.
By default, wildcard search is enabled for each front end of the search appliance. You can disable or enable wildcard search for one or more front ends by using the Filters tab of the Search > Search Features > Front Ends page. For more information about wildcard search, click Admin Console Help > Search > Search Features > Front Ends > Filters. See also “Enabling Wildcard Search” in Creating the Search Experience.
Language Improvements
Release 7.2 includes quality improvements to the supported languages and stemming files for additional languages. The additional languages are:
• Danish (DA)
• Greek (EL)
• Finnish (FI)
• Hungarian (HU)
• Norwegian (NO)
• Romanian (RO)
• Turkish (TR)
Trusted Applications
Release 7.2 introduces the concept of “trusted applications.” The search appliance enables trusted applications to send end users' search requests along with pre-validated IDs when performing a secure search. The search appliance returns secure results without requiring further validation of the user.
Before using this feature, you must enable trusted applications and register your application as a “trusted application” on the search appliance. After that, the trusted application can interact with the search appliance.
To register a trusted application with the search appliance, use the Search > Secure Search > Trusted Applications page. For information about using this page, click Admin Console Help > Search > Secure Search > Trusted Applications. See also “Using Trusted Applications” in Managing Search for Controlled-Access Content.
New Entity Recognition Features
Entity recognition enables the Google Search Appliance to discover interesting entities in documents with missing or poor metadata and store these entities in the search index. Release 7.2 introduces new entity recognition features on the Index > Entity Recognition page, including entity diagnostics and adjustments.
Entity diagnostics enables you to test your entity recognition configuration for a specific HTML document. For this document, you can see highlighted entities that the search appliance has extracted from it. If there are issues with extracted entities, you can modify your entity recognition configuration and retest. By testing, you can gain an understanding of your dictionaries and use this understanding to develop the best ones for your corpus.
The new Adjustments tab enables expert users to fine-tune entity recognition parameters. To learn more, click Admin Console Help > Index > Entity Recognition. See also “Discovering and Indexing Entities” in Administering Crawl.
New Crawl Configuration Features
Release 7.2 introduces new features for configuring crawl on the Content Sources > Web Crawl > Start and Block URLs page. This page gives you the option of working with URLs and patterns in one of two views:
• Action view: Provides several features for working with individual URLs and patterns. Use Action view when you want to configure crawl or validate, troubleshoot, recrawl, or test a URL or pattern.
• Batch Edit view: Enables you to work with multiple URLs. Use Batch Edit view when you want to add, edit, or delete multiple URLs and patterns at once.
For more information, click Admin Console Help > Content Sources > Web Crawl > Start and Block URLs. See also “Configuring a Crawl” in Administering Crawl.
Connectors 4.0 (Beta)
Beta versions of the following connectors are available with release 7.2:
• SharePoint Connector 4.0
• SharePoint User Profile Connector 4.0
• Active Directory Connector 4.0
• File System Connector 4.0
For information about using 4.0 Beta connectors with release 7.2, see the Connector documentation page in the GSA help center.
Connectors 3.2
Release 3.2 of the Google Search Appliance Connectors is available for use with release 7.2. For more information about using the 3.2 connectors with release 7.2, see the Connector documentation page in the GSA help center.
Deprecated Legacy Security
Beginning with release 7.2, the Google Search Appliance no longer supports legacy authentication and authorization.
Legacy authentication was superseded in release 6.2 by the Security Manager and Universal Login, but the search appliance continued to support legacy authentication configurations. In release 7.2, the search appliance no longer supports them.
Legacy authorization was superseded in release 6.14, but the search appliance continued to support it through release 7.0. In release 7.0, you could choose whether to enable Flexible Authorization or continue to use legacy authorization. In release 7.2, Flexible Authorization is automatically enabled for the search appliance.
If you have been using legacy authentication, migrate to security manager-based authentication (Universal Login) before updating to release 7.2. If you have been using legacy authorization, migrate to Flexible Authorization before updating to release 7.2.
In release 7.2, the security manager's Head Requestor does not support using a proxy server. If you are considering updating from release 7.0 to release 7.2 and you require a proxy server to handle HEAD requests, you should not update to release 7.2 at this time.
Release 7.2 Documentation
Google Search Appliance documentation is available from the GSA help center as both web pages in HTML format and PDF (Portable Document Format) files that you can download and print. Search appliance troubleshooting and deployment documentation, as well as copies of Admin Console help, are also available from the GSA help center. Please check them out!
Specifications and Usage Limits
This release also includes a new document, Specifications and Usage Limits, which lists system limits in GSA. You can use this information to help you plan your deployment and to configure your system for optimal performance.
Supported Browsers and Third-Party Software
Google has certified the browsers and third-party products in the following table for use with software release 7.2.
Category | Product Name and Version
Browsers | Google Chrome (which automatically updates whenever it detects that a new version of the browser is available)
 | Microsoft Internet Explorer (IE) 8 and above in standard mode
 | Mozilla Firefox current and previous major releases
 | The Admin Console supports Safari 5.1 only; search supports Safari 5.1 and later
Databases | IBM DB2
 | MS SQL Server 2000, MS SQL Server 2005, and MS SQL Server 2008
 | MySQL 4.1.13 and 5.0.37. Note: Version 5.0 requires a JDBC upgrade.
 | Sybase Adaptive Server Enterprise (ASE) 15 Express
 | Oracle Database 10g Express Edition on Linux x86 and Windows platforms
JDBC enablers | DB2 Universal Database (UDB) 8.1.0.64
 | MySQL Connector/J 3.1.13
 | Microsoft SQL Server 2008, JDBC Driver 2.0
 | Sybase jConnect for JDBC 5.5, build 25137
 | Oracle Database 10g Release 2, 10.1.0.2.0 driver
Network infrastructure | HTTP 1.1
 | HTTPS
 | Network Time Protocol (NTP) servers 3.0 and higher
 | SMB
SSO (Single sign-on) | Computer Associates SiteMinder 6.0, Policy Server and Web Agent
 | Oracle Access Manager 7.0.4 (formerly Oblix)
 | Cams by Cafesoft, version 3.0
Web servers | Apache
 | Netscape Enterprise
 | Microsoft Internet Information Server
XSLT | Extensible Stylesheet Language Transformations: XSLT 2.0
 | XML Path Language: XPath 1.0